What we solve Digitate’s empowers organizations to transform their operations with intelligence, insights, and actions. Platform Overview Products ignio AIOps Redefining IT operations with AI and automation ignio Observe Cloud Visibility and Cost Optimization Business Health Monitoring IT Event Management ignio AI.Workload Management Enabling predictable, Agile and Silent batch operations in a closed-loop solution Business SLA Prediction ignio AI.ERPOps End-to-end automation for incidents and service requests in SAP IDoc Management for SAP ignio AI.Digital Workspace Autonomously detect, triage and remediate endpoint issues ignio Cognitive Procurement AI-based analytics to improve Procure-to-Pay effectiveness ignio AI.Assurance Transform software testing and speed up software release cycles

What we do Digitate helps enterprises improve the resilience and agility of their IT and business operations with our SaaS – based platform . Platform Overview Platform ignio™ Platform ignio™, Digitate’s SaaS-based platform for autonomous operations, combines observability and AIOps capabilities to solve operational challenges Industries Autonomous IT Solutions for the Modern Industry BFSI Retail Healthcare & Life Sciences Travel & Hospitality Consumer Packaged Goods AI Agents ignio’s AI agents, with their ability to perceive, reason, act, and learn deliver measurable business value and transform IT operations. AI Agent for IT Event Management AI Agent for Incident Resolution AI Agent for Cloud Cost Optimization AI Agent for Proactive Problem Management AI Agent for Business SLA Predictions

Looking for Support? Access your instances, manage tasks and explore self-service help all in one place. Security and Privacy Global Compliance Agreements Support Service and Support Discover what top industry analysts have to say about Digitate Community Connect with other customers to share tips, exchange resources Pricing Every ignio capability is priced on a usage metric tightly correlated to the value it delivers Trust Center Digitate policies on security, privacy, and licensing Documentation Find answers to your technical questions and learn how to use Digitate products

Who we are At Digitate, we’re committed to helping enterprise companies, realize autonomous operations. Integration Channel Partner Technology Partner Azure Marketplace Company Leadership We’re committed to helping enterprise companies realize autonomous operations Newsroom Explore the latest news and information about Digitate Partners Grow your business with our Elevate Partner program Academy Evolve your skills and get certified Contact Us Get in touch or request a demo

What we solve Digitate’s empowers organizations to transform their operations with intelligence, insights, and actions. Platform Overview Products ignio AIOps Redefining IT operations with AI and automation ignio Observe Cloud Visibility and Cost Optimization Business Health Monitoring IT Event Management ignio AI.Workload Management Enabling predictable, Agile and Silent batch operations in a closed-loop solution Business SLA Prediction ignio AI.ERPOps End-to-end automation for incidents and service requests in SAP IDoc Management for SAP ignio AI.Digital Workspace Autonomously detect, triage and remediate endpoint issues ignio Cognitive Procurement AI-based analytics to improve Procure-to-Pay effectiveness ignio AI.Assurance Transform software testing and speed up software release cycles

What we do Digitate helps enterprises improve the resilience and agility of their IT and business operations with our SaaS – based platform . Platform Overview Platform ignio™ Platform ignio™, Digitate’s SaaS-based platform for autonomous operations, combines observability and AIOps capabilities to solve operational challenges Industries Autonomous IT Solutions for the Modern Industry BFSI Retail Healthcare & Life Sciences Travel & Hospitality Consumer Packaged Goods AI Agents ignio’s AI agents, with their ability to perceive, reason, act, and learn deliver measurable business value and transform IT operations. AI Agent for IT Event Management AI Agent for Incident Resolution AI Agent for Cloud Cost Optimization AI Agent for Proactive Problem Management AI Agent for Business SLA Predictions

Analyst Reports Discover what the top industry analysts have to say about Digitate Blogs Explore Insights on Intelligent Automation from Digitate experts ROI Get Insights from the Forrester Total Economic Impact™ study on Digitate ignio Case Studies Learn how Digitate ignio helped transform the Walgreens Boots Alliance Trust Center Digitate policies on security, privacy, and licensing e-Books Digitate ignio™ eBooks Provide Insights into Intelligent Automation Infographics Discover the Capabilities of ignio™’s AI Solutions Reference Guides Guides cover AIOps and SAP automation examples, use cases, and selection criteria White Papers and POV Discover ignio White papers and Point of view library Webinars & Events Explore our upcoming and recorded webinars & events

Who we are At Digitate, we’re committed to helping enterprise companies, realize autonomous operations. Integration Channel Partner Technology Partner Azure Marketplace Company Leadership We’re committed to helping enterprise companies realize autonomous operations Newsroom Explore the latest news and information about Digitate Partners Grow your business with our Elevate Partner program Academy Evolve your skills and get certified Contact Us Get in touch or request a demo

What we solve Digitate’s empowers organizations to transform their operations with intelligence, insights, and actions. Platform Overview Products ignio AIOps Redefining IT operations with AI and automation ignio Observe Cloud Visibility and Cost Optimization Business Health Monitoring IT Event Management ignio AI.Workload Management Enabling predictable, Agile and Silent batch operations in a closed-loop solution Business SLA Prediction ignio AI.ERPOps End-to-end automation for incidents and service requests in SAP IDoc Management for SAP ignio AI.Digital Workspace Autonomously detect, triage and remediate endpoint issues ignio Cognitive Procurement AI-based analytics to improve Procure-to-Pay effectiveness ignio AI.Assurance Transform software testing and speed up software release cycles

What we do Digitate helps enterprises improve the resilience and agility of their IT and business operations with our SaaS – based platform . Platform Overview Platform ignio™ Platform ignio™, Digitate’s SaaS-based platform for autonomous operations, combines observability and AIOps capabilities to solve operational challenges Industries Autonomous IT Solutions for the Modern Industry BFSI Retail Healthcare & Life Sciences Travel & Hospitality Consumer Packaged Goods AI Agents ignio’s AI agents, with their ability to perceive, reason, act, and learn deliver measurable business value and transform IT operations. AI Agent for IT Event Management AI Agent for Incident Resolution AI Agent for Cloud Cost Optimization AI Agent for Proactive Problem Management AI Agent for Business SLA Predictions

Analyst Reports Discover what the top industry analysts have to say about Digitate Blogs Explore Insights on Intelligent Automation from Digitate experts ROI Get Insights from the Forrester Total Economic Impact™ study on Digitate ignio Case Studies Learn how Digitate ignio helped transform the Walgreens Boots Alliance Trust Center Digitate policies on security, privacy, and licensing e-Books Digitate ignio™ eBooks Provide Insights into Intelligent Automation Infographics Discover the Capabilities of ignio™’s AI Solutions Reference Guides Guides cover AIOps and SAP automation examples, use cases, and selection criteria White Papers and POV Discover ignio White papers and Point of view library Webinars & Events Explore our upcoming and recorded webinars & events

Who we are At Digitate, we’re committed to helping enterprise companies, realize autonomous operations. Integration Channel Partner Technology Partner Azure Marketplace Company Leadership We’re committed to helping enterprise companies realize autonomous operations Newsroom Explore the latest news and information about Digitate Partners Grow your business with our Elevate Partner program Academy Evolve your skills and get certified Contact Us Get in touch or request a demo

Operationalizing Generative AI for Scalable, Production-Ready Solutions

What we solve

Digitate’s SaaS AIOps empowers organizations to transform their operations with intelligence, insights, and actions.

ignio Products

Cognitive Procurement

Assurance

Platform

ignio™ Platform

ignio™, Digitate’s SaaS-based platform for autonomous operations, combines observability and AIOps capabilities to solve operational challenges

Agents

AI Agents

ignio’s AI agents, with their ability to perceive, reason, act, and learn deliver measurable business value and transform IT operations.

Industries?

Explore purpose-built solutions for your industry’s evolving challenges

View all Industries

Industries

BFSI

AI-powered operations and automation for resilient, efficient banking and financial services.

Travel & Hospitality

Enhance travel and hospitality performance with AI, improving service quality and operational resilience

Retail

Transform retail operations with AI, automation, and insights for seamless customer experiences

Consumer Packaged Goods

Drive smarter CPG value chains with AI-powered automation and real-time consumer insights

Healthcare & Life Sciences

Drive resilient life sciences operations with automation, analytics, and regulatory-ready insights

Looking for something?

Discover how we empower customer success and explore our latest eBooks, white papers, blogs, and more.

Blogs

Podcasts

Customers Success

IDC MarketScape Report

Resources

Analyst Reports

Discover what top industry analysts have to say about Digitate

ROI

Get insights from the Forrester Total Economic Impact™ study on Digitate ignio

Webinars & Events

Explore our upcoming and recorded webinars & events

Infographics

Discover the capabilities of ignio™’s AI solutions

Blogs

Explore insights on intelligent automation from Digitate experts

Podcasts

Explore our upcoming and recorded podcast

e-Books

Digitate ignio™ eBooks provide insights into intelligent automation

Case Studies

Learn how businesses overcame key AI-driven automation issues

Reference Guides

Guides cover AIOps and SAP automation examples, use cases, criteria

White Papers and POV

A library of in-depth insights and actionable strategies

What are our expectations from a well operationalized GenAI Application? The asks are like any other predictive or traditional ML application like is the system accurate, efficient, effective, safe, reliable, and scalable. In this blog, we discuss some key operationalization aspects that need to be considered to build production ready end-to-end LLM model pipelines.

1. Designing the data pipeline

It is important to identify and map the data sources which will be input to the LLM. These could be documents, code, structured data etc. The data pipeline needs to be built including ingestion and preprocessing of data. This data can be then used as part of context to the LLM. For certain use cases one needs to match the end user input at inference time to a part of existing documents to retrieve the best matched relevant chunks of data. These chunks are then added to the context given to the LLM along with original user question and prompt. Vector databases are best suited for such pipelines. Examples of vector databases include Pinecone, Weaviate etc.

2. Dealing with ambiguity of LLMs

The output of LLMs is stochastic in nature and may differ even with same input, prompts and context. This may not be palatable in certain use cases and needs to be managed carefully by setting the hyperparameters (such as temperature and top_k) of the model. Another way is to engineer workarounds and design workflows to handle such scenarios in the application. This is one of the key differences between handling the output of traditional ML and LLMs!

3. Selecting the model

The model that is best suited for the task needs to be selected. Generally, we need models for two purposes, embedding generation and query completion via LLM. Various choices are available in both cases. Factors impacting model selection include embedding dimension, sequence length, cost, infrastructure needs, latency and the key benchmark metric required for the task. Models need to be studied for their benchmarks and then selected for the task. Always refer to the leaderboard. For example, we can refer to the Massive Text Embedding Benchmark (MTEB) leaderboard to select the best embedding model. And check for TruthfulQA benchmark for Q and A specific use case. It is a benchmark designed to evaluate the truthfulness of language models in generating answers to diverse questions.

Also, model selection needs to be revisited periodically as newer models keep popping up every now and then! The model’s performance also needs to be reviewed periodically as we will see in the next section.

Some models can be hosted by the enterprise itself (on self-owned infra) and then served while some others are already hosted and need to be consumed via an API. For example, models are available via OpenAI, Huggingface, Cohere and also via Cloud providers such as AWS, Azure and GCP. These are available via API calls. Open-Source models such as Falcon and Llama are available if the need is to host models on self-owned infra or one needs to continuously do extensive fine tuning for various tasks and maintain local model versions. Many ML Ops libraries provide such features (MLFLOW, Cloud providers etc.).

Model size is also a factor to be considered. There is a tradeoff between model size and accuracy. The higher the parameters, the greater the size and hence more the memory /compute subsequently increasing costs This impacts latency as well.

4. Selecting the design pattern

The architecture pattern needs to be decided for the LLM application depending on the type of use case, data sources (internal or external), etc. Options include simple inference calls to model with minimal prompts where they utilize their existing knowledge or alternative implementation using the in-context learning ability of the LLMs such as Zero and Few shot learning. Another option is to implement a Retrieval Augmented Generation architecture which uses internal documents to limit the context given for a single query to the model. This pattern also uses vector stores and is implemented as a pipeline.

In multi-step applications, AI Assistants or AI agents can be used. They can perform tasks such as web search, web browser-based actions, invoke code interpreters &SQL executors, etc. These tasks can be done in conjunction with the LLM output and allow the application to perform a series of automated tasks.

5. Experimenting with prompts

This is one of the most vital aspects of getting the right output. Prompt evaluation involves checking if the model understands the examples given in the prompt or does it overfit the Few shot examples! Further, prompt versioning and prompt optimization are required to have a good system output. Various tools such as MLFLOW prompt Engineering UI or Azure Prompt flow can be used to perform experiments, iterate, and tune the prompts.

Prompt tuning is another technique to be used only with open-source models (not APIs). This involves programmatically changing the embedding of the prompt and providing them as input to the model. This is an expensive task as each task requires a separate model.

6. LLMOps orchestration tooling

This is one of the most critical components in building the end-to-end pipeline. This tool acts as a central orchestrating tool interacting with various other components and controlling the flow of actions and data. Various LLMOps tooling are available in the market. Few are open source, and many are paid versions. Important ones among them are Langchain, Auto-GPT, LlamaIndex, AutoAI APIs. Decisions should be made based on ease of use and type of features (such as playground, extensible APIs, quick development of changes in industry).

7. Finetuning the models

When the enterprise has good amount of data in the form of “provided input” vs “to be generated output”, it can be fed to the model for finetuning it. Then the model will learn based on the input given. Finetuning is the process when the weights of the model are updated by mini training certain layers of the neural network-based architecture for additional data only. The new model can then be hosted and used for subsequent inference. The disadvantage is higher cost of fine tuning and then using that model for inference. The advantage is one can feed specific data and make the model behave accordingly.

But what if we have only a few examples to show to the model? In such cases, we use prompting techniques such as few-shot learning and make use of the in-context learning. Here the weights of the model remain unchanged and in context instruction is followed by the model. But this is limited by the token length of the model. Also, beyond a point, the longer the input context, more the chance of the model forgetting a part of it! So, consider fine tuning in such scenarios.

Engineering wise also, in case of self-hosted models, fine tuning involves implementation of lots of optimization techniques such a Quantization ,Low Rank Adaptation(LORA) and Parameter-Efficient Finetuning (PEFT).

8. Planning for deployment/inference

Once the model is ready to be deployed in production for its end use, design of scalable application architecture comes into play. Model size, user load on the application and latency and throughput requirements determines the system infrastructure (compute and memory). For example, we could select a GPU cluster to load a 70 billion parameter size model!

These factors also influence the choice of serving application architecture. The application could include simple python script calls or full-fledged server applications (such as Django and Gunicorn). Most providers such as Azure or Databricks have an option of hosted model serving and horizontal scalability. Many frameworks such as Ray-serve help us achieve optimized inference for self-hosted applications as well.

9. Review, evaluation, and governance

Once the application is developed, certain metrics need to be checked, and the cycle needs to be iterated till we have the desired numbers. Quality metrics for evaluating functional aspects include “exact-match” for Q and A models, ROUGE (Recall-Oriented Understudy for Gisting Evaluation) for summarization, etc. We can also evaluate the model output using another evaluator LLM. Some metrics that can be measured using this method are answer-correctness, answer-faithfulness, etc. Langchain and MLFLOW are some tools that help in the framework for evaluating model performance. Testing datasets do play a key role in automated measurement. Human feedback is important in evaluation as well. In some cases where LLM is used for prediction/classification, traditional metrics do apply.

Governance of the models is vital in making sure they are not misused or pose a threat to processes or people. Steps should be taken to eliminate bias and toxicity. Also, relevant guardrails for quality, security, transparency, equity in response etc. should be in place to mitigate risks arising from LLM use. These guardrails can be implemented using prompt engineering, model hyperparameters, type of model, explicit checks, limiting the input context etc.

10. Monitoring of models

For model observability, it is important to log and track various parameters of the application after its implementation. Such as logging the number of requests, response time, token usage, costs, CPU/GPU, and memory metrics, etc. This helps in keeping the system up and the infrastructure optimized. Also logging the entire chain from input (prompt, context) to response output leads to transparency, auditability and provides functional insights on the performance of the model.

The metrics tracked during development need to be periodically monitored using the test datasets and the new inference data available. Any drift in performance needs to be flagged using an alerting mechanism. Alerts beyond certain thresholds are an effective way to point to aberrations.

Closing notes

It is important for enterprises to have a mechanism in place to ensure diligent implementation of the above mentioned steps. While this is a multi-team effort (devops,infra,business etc.) a centralized and homogenous approach to manage the lifecycle of all LLM applications, right from pilot to post-production, is recommended. Finally, as GenAI space evolves rapidly, it is important to evolve these steps in line, to reap maximum and timely benefits of GenAI.

Who we are​

Operationalizing Generative AI to build production-ready solutions

Table of Contents

Recent Blogs

1. Designing the data pipeline

2. Dealing with ambiguity of LLMs

3. Selecting the model

4. Selecting the design pattern

5. Experimenting with prompts

6. LLMOps orchestration tooling

7. Finetuning the models

8. Planning for deployment/inference

9. Review, evaluation, and governance

10. Monitoring of models

Closing notes

Sarang Varhadpande

Get started with Digitate

Who we are