#DocumentAI
Text
Google Cloud Document AI Layout Parser For RAG pipelines
Google Cloud Document AI
One of the most frequent challenges in developing retrieval augmented generation (RAG) pipelines is document preparation. Parsing documents, such as PDFs, into digestible parts that can be used to create embeddings frequently calls for Python expertise and additional libraries. In this blog post, we examine new features in BigQuery and Google Cloud Document AI that make this process easier, and walk through a detailed example.
Streamline document processing in BigQuery
With its tight integration with Google Cloud Document AI, BigQuery now provides the capability of preprocessing documents for RAG pipelines and other document-centric applications. Now that it is generally available, the ML.PROCESS_DOCUMENT function can access additional processors, such as Document AI's Layout Parser processor, which enables you to parse and chunk PDF documents using SQL syntax.
ML.PROCESS_DOCUMENT’s GA offers developers additional advantages:
Increased scalability: the ability to process documents more quickly and to handle larger ones, up to 100 pages
Simplified syntax: a simplified SQL syntax that makes it easier to communicate with Document AI processors and integrate them into your RAG workflows
Document chunking: access to additional Document AI processor capabilities, such as Layout Parser, for creating the document chunks required for RAG pipelines
Document chunking in particular is a crucial yet difficult step in creating a RAG pipeline, and Google Cloud Document AI Layout Parser makes this procedure simpler. Let's examine how this works in BigQuery and then illustrate its efficacy with a real-world example.
Document preprocessing for RAG
A large language model (LLM) can provide more accurate responses when huge documents are divided into smaller, semantically related components. This increases the relevance of the information that is retrieved.
To further improve your RAG pipeline, you can generate metadata along with chunks, such as document source, chunk position, and structural information. This will allow you to filter, refine your search results, and debug your code.
The diagram below gives a high-level summary of the preparation stages of a simple RAG pipeline. (Image credit: Google Cloud)
Build a RAG pipeline in BigQuery
Because of their intricate structure and combination of text, numbers, and tables, financial records such as earnings statements can be difficult to compare. Let’s show you how to use Document AI’s Layout Parser to create a RAG pipeline in BigQuery for analyzing the Federal Reserve’s 2023 Survey of Consumer Finances (SCF) report. You may follow along here in the notebook.
Conventional parsing methods have considerable difficulties when dealing with dense financial documents, such as the SCF report from the Federal Reserve. It is challenging to properly extract information from this roughly 60-page document because it has a variety of text, intricate tables, and embedded charts. In these situations, Google Cloud Document AI Layout Parser shines, efficiently locating and obtaining important data from intricate document layouts like these.
Building a BigQuery RAG pipeline with Document AI's Layout Parser involves the following general steps.
Create a Layout Parser processor
Make a new processor of the LAYOUT_PARSER_PROCESSOR type in Google Cloud Document AI. Then create a remote model in BigQuery that points to this processor so BigQuery can access and process the documents.
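A minimal sketch of that remote model, assuming a BigQuery Cloud resource connection named us.docai_conn and a placeholder processor ID (both names are assumptions, not values from this post):

CREATE OR REPLACE MODEL docai_demo.layout_parser
  REMOTE WITH CONNECTION `us.docai_conn`
  OPTIONS (
    remote_service_type = 'CLOUD_AI_DOCUMENT_V1',
    document_processor = 'projects/PROJECT_ID/locations/us/processors/PROCESSOR_ID'
  );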
Request chunk creation from the processor
SELECT *
FROM ML.PROCESS_DOCUMENT(
  MODEL docai_demo.layout_parser,
  TABLE docai_demo.demo,
  PROCESS_OPTIONS => (JSON '{"layout_config": {"chunking_config": {"chunk_size": 300}}}')
);
Create vector embeddings for the chunks
Using the ML.GENERATE_EMBEDDING function, we create embeddings for every document chunk and write them to a BigQuery table to facilitate semantic search and retrieval. The function requires two arguments (a sketch follows this list):
A remote model that calls a Vertex AI embedding endpoint.
A BigQuery table column containing the content to embed.
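A sketch of this step, assuming the Layout Parser chunks were materialized into a table docai_demo.chunks with content and uri columns (the chunks table name is an assumption; the model and output names match the queries later in this post):

CREATE OR REPLACE TABLE docai_demo.embeddings AS
SELECT *
FROM ML.GENERATE_EMBEDDING(
  MODEL docai_demo.embedding_model,  -- remote model over a Vertex AI embedding endpoint
  TABLE docai_demo.chunks            -- must expose a 'content' column
);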
Create a vector index on the embeddings
Next, build a vector index on the embeddings to efficiently search large numbers of chunks by semantic similarity. Without a vector index, a search must compare the query embedding against every embedding in your dataset, which is cumbersome and computationally costly when working with many chunks. Vector indexes use strategies such as approximate nearest neighbor search to speed this process up.
CREATE VECTOR INDEX my_index
ON docai_demo.embeddings(ml_generate_embedding_result)
OPTIONS (index_type = 'TREE_AH', distance_type = 'EUCLIDEAN');
Retrieve relevant chunks and send to LLM for answer generation
We can now conduct a vector search to locate chunks that are semantically related to an input query. In this instance, we ask about the changes in average family net worth over the three years covered by the report.
SELECT ml_generate_text_llm_result AS generated, prompt
FROM ML.GENERATE_TEXT(
  MODEL docai_demo.gemini_flash,
  (
    SELECT CONCAT(
      'Did the typical family net worth change? How does this compare to the SCF survey a decade earlier? Be concise and use the following context:',
      STRING_AGG(FORMAT("context: %s and reference: %s", base.content, base.uri), ',\n')
    ) AS prompt
    FROM VECTOR_SEARCH(
      TABLE docai_demo.embeddings, 'ml_generate_embedding_result',
      (
        SELECT ml_generate_embedding_result, content AS query
        FROM ML.GENERATE_EMBEDDING(
          MODEL docai_demo.embedding_model,
          (SELECT 'Did the typical family net worth increase? How does this compare to the SCF survey a decade earlier?' AS content)
        )
      ),
      top_k => 10,
      OPTIONS => '{"fraction_lists_to_search": 0.01}'
    )
  ),
  STRUCT(512 AS max_output_tokens, TRUE AS flatten_json_output)
);
And we have an answer: median family net worth rose 37% between 2019 and 2022, a substantial rise compared with the 2% decline observed over the same period a decade earlier. If you look at the original report, you'll see that this information is spread across text, tables, and footnotes, areas that are typically difficult to interpret and draw conclusions from together!
Although a simple RAG flow was shown in this example, real-world applications frequently call for constant updates. Consider a situation in which a Cloud Storage bucket receives new financial information every day. Consider using Cloud Composer or BigQuery Workflows to create embeddings in BigQuery and process new documents incrementally to keep your RAG pipeline current. When the underlying data changes, vector indexes are automatically updated to make sure you are always querying the most recent data.
Read more on Govindhtech.com
#DocumentAI#AI#RAGpipelines#BigQuery#RAG#GoogleCloudDocumentAI#LLM#cloudcomputing#News#Technews#Technology#Technologynews#Technologytrends#govindhtech
1 note
Text
Revolutionizing Document Management with PCG’s Advanced Document AI Services
PCG IT Consulting Company proudly presents its cutting-edge Document AI Services, designed to revolutionize document handling and management processes across various sectors. By harnessing the power of advanced AI and OCR technologies, our services optimize the extraction, processing, and management of document data, ensuring accuracy, speed, and compliance.
Our Document AI solutions are tailored to enhance efficiency in industries such as legal, financial, and education. These services automate tedious manual tasks, reduce the risk of errors, and secure sensitive data, allowing businesses to focus on strategic decisions and core operations. With PCG’s Document AI Services, organizations can quickly access and analyze critical information, improving response times and decision-making processes.
Experience the benefits of reduced operational costs, enhanced data accessibility, and improved compliance with regulations. Our solutions are not just about technology but about transforming your business processes to be more agile and competitive in a digital-first world. Partner with PCG IT Consulting Company to empower your organization with the tools it needs to succeed in the rapidly evolving business landscape.
0 notes
Text
Maximizing Efficiency with Contract AI and O2C Automation
Contract AI (Contract Artificial Intelligence):
Contract AI refers to using machine learning and artificial intelligence (AI) technologies to streamline and optimize the management of contracts throughout their lifecycle. It involves the application of AI algorithms to analyze, extract, and interpret data from contracts, automate contract-related processes, and improve contract risk management and compliance. Contract AI aims to enhance efficiency, reduce manual labor, mitigate risks, and facilitate better decision-making in managing legal agreements and contracts within organizations.
O2C Automation (Order-to-Cash Automation):
O2C Automation, also known as Order-to-Cash Automation, implements automated processes and technologies to optimize and streamline the entire order-to-cash cycle in a business. This cycle encompasses all the steps in fulfilling customer orders, from order initiation to payment receipt. O2C Automation typically involves robotic process automation (RPA), workflow automation, and data analytics to improve order processing efficiency, reduce errors, enhance cash flow management, and provide a better customer experience. It plays an important role in financial and customer relationship management within organizations, particularly in the BFSI (Banking, Financial Services, and Insurance) sector.
Discuss the need for modernization in BFSI
Modernization in the Banking, Financial Services, and Insurance (BFSI) sector is imperative due to the convergence of technological innovation, shifting customer expectations, stringent regulatory demands, and heightened competition. BFSI institutions must embrace digital transformation to stay relevant and competitive, providing seamless and personalized services while adhering to evolving regulatory frameworks. Modernization enhances operational efficiency, enables robust risk management, fosters innovation, and facilitates cost savings, ultimately ensuring that organizations can adapt to a rapidly changing financial landscape and deliver value to their customers and shareholders.
Benefits of Contract AI in BFSI
Improved contract management
Enhanced compliance and risk management
Faster contract review and approval
Cost savings through automation
Case studies of BFSI companies benefiting from Contract AI
Benefits of O2C Automation in BFSI
Streamlined order processing
Improved cash flow management
Enhanced customer experience
Reduced errors and fraud prevention
Real-world examples of O2C Automation success stories in BFSI
Challenges in Implementing Contract AI and O2C Automation
Below are the key challenges in implementing Contract AI and Order-to-Cash (O2C) Automation in the Banking, Financial Services, and Insurance (BFSI) sector, along with strategies to overcome them:
Data Privacy and Security Concerns:
Data Encryption: Implement robust data encryption techniques to protect sensitive contract and financial data in transit and at rest. Use industry-standard encryption protocols to secure data.
Access Control: Implement strict access controls and role-based permissions to make sure that only authorized personnel can access sensitive information. Regularly audit and monitor access to identify any unauthorized activity.
Data Privacy Compliance: Ensure your systems and processes comply with data privacy regulations such as GDPR. Conduct regular privacy impact assessments to identify and mitigate potential risks.
Integration with Legacy Systems:
APIs and Middleware: Invest in middleware solutions or develop APIs to bridge the gap between modern Contract AI and O2C Automation systems and legacy systems. This allows for smoother data exchange and process integration.
Gradual Migration: Consider a phased approach to integration, where you gradually migrate specific processes or functions to the new system, reducing the immediate burden on legacy systems.
Customization: Tailor integration solutions to the specific needs of your organization. Custom development may be necessary to ensure seamless connectivity.
Staff Training and Change Management:
Comprehensive Training: Provide comprehensive training programs for employees who will be using the new Contract AI and O2C Automation systems. Training should cover both the technical aspects and the benefits of the new systems.
Change Champions: Identify and train "change champions" within your organization—individuals who can champion the adoption of new technologies and processes and help colleagues adapt.
Continuous Learning: Foster a culture of adaptation and continuous learning. Encourage employees to keep up with technology trends and actively seek feedback to improve processes.
Regulatory Compliance Challenges:
Regulatory Expertise: Employ or consult with regulatory experts who have a deep understanding of the BFSI sector. They can assist you in navigating complex compliance requirements and keeping your systems up to date.
Automated Compliance Monitoring: Utilize automation and AI for real-time monitoring of regulatory compliance. Implement compliance checks and alerts within your Contract AI and O2C Automation systems.
Regular Audits: Conduct regular audits of your systems and processes to ensure compliance. Document compliance efforts to demonstrate due diligence in case of regulatory inquiries.
Strategies to Overcome These Challenges:
Cross-Functional Teams: Form cross-functional teams involving IT, legal, compliance, and business units to collaboratively address challenges and ensure a holistic approach to implementation.
Pilot Programs: Begin with small-scale pilot programs to test the effectiveness of Contract AI and O2C Automation solutions while identifying and addressing issues on a smaller scale before full-scale deployment.
Third-Party Expertise: Consider partnering with experienced technology vendors or consultants who specialize in BFSI automation. They can provide valuable guidance and insights throughout the implementation process.
Continuous Improvement: Implement continuous improvement practices to refine processes, enhance data security, and adapt to changing regulations and technology advancements over time.
Communication: Maintain open and transparent communication with stakeholders at all stages of implementation. Address concerns and provide regular updates to build confidence in the new systems.
Documentation: Keep detailed records of your implementation process, including decisions, changes, and compliance efforts. This documentation can be invaluable for audits and ongoing improvement.
Technologies Behind Contract AI and O2C Automation
Natural Language Processing (NLP)
Machine Learning and Predictive Analytics
Robotic Process Automation (RPA)
Blockchain for contract security
Cloud computing for scalability
Steps to Implement Contract AI and O2C Automation
Assessing your organization's readiness
Selecting the right technology and vendors
Developing a phased implementation plan
Training and upskilling your workforce
Measuring and optimizing the implementation's success
Future Trends and Innovations
AI-driven chatbots for customer inquiries
Smart contracts and decentralized finance (DeFi)
Predictive analytics for financial forecasting
AI in risk assessment and fraud detection
Conclusion and Future Outlook
In conclusion, Contract AI and Order-to-Cash (O2C) Automation stand as transformative forces in the Banking, Financial Services, and Insurance (BFSI) sector, poised to redefine how contracts are managed and financial processes are streamlined. Despite the challenges, these technologies offer the promise of heightened efficiency, accuracy, compliance, and customer satisfaction. As BFSI organizations navigate the complexities of data privacy, legacy system integration, staff adaptation, and regulatory adherence, they must recognize that embracing these innovations is not an option but a necessity for remaining competitive and resilient in an ever-evolving financial landscape. The successful implementation of Contract AI and O2C Automation holds the potential to revolutionize the BFSI sector, shaping a future where financial operations are faster, more secure, and aligned with the demands of a digital-first world.
0 notes
Text
Agnès Varda in Les glaneurs et la glaneuse (2000)
12 notes
Text
Seeking Mavis Beacon (dir. Jazmin Jones) x DOXA 2024. (via The Independent)
Two women investigate the disappearance of the iconic real-life model behind the popular 1980s educational software while raising pertinent issues concerning our relationship to technology. Jones and her friend, Caribbean-American video artist Olivia McKayla Ross, create their own portrait of their image of "Mavis Beacon" while blending the facts they uncover with their own fictional interpretations. Their exercise feels like an experimental detour interrogating our cultural fascinations (like true crime or conspiratorial fare) as they figure out who Renee L’Esperance, a Haitian immigrant model, perfume saleswoman, and the face behind Mavis Beacon Teaches Typing, really is and why she vanished from public life entirely.
Screening as part of the 2024 DOXA Documentary Film Festival at The Cinematheque on May 12.
#saving mavis beacon#mavis beacon#events#documentay review#jazmin jones#documentary#media#doxa#reviews#features#movie#olivia mckayla ross#movies#movie review#film#renee lesperance#renee l'esperance#film review#documentary review#cinema#seeking mavis beacon#typing
44 notes
Text
Documentaries about veganism and animal rights.
Veganville
Infinity and Back
The Cove
Cowspiracy
Blackfish
Hugh’s Big Fish Fight
73 Cows
Okja
Dominion
Earthlings
The Game Changers
Forks Over Knives
Love and Let Live
Running for Good
Before the Flood
Vegucated
Seaspiracy
Eating Animals
Milked
What the Health
Eating our Way to Extinction
Sharkwater
Fat Sick & Nearly Dead
The Ghosts in our Machine
The Witness
Peaceable Kingdom: The Journey Home
7 Days Mini-Documentary
The Land of Hope and Glory
Speciesism: The Movie
A Prayer for Compassion
The Milk System
Eating you Alive
HOPE
Unity
PlantPure Nation
Food Choices
Carnage: Swallowing the Past
Planteat
Vegan: Everyday Stories
Farm to Fridge
Meet Your Meal
Swine
From the Ground Up
Vegan 2020
The End of Medicine
The Invisible Vegan
My Octopus Teacher
Breaking the Chain
The Animal People
Long Gone Wild
Sled Dogs
Fed Up
Food, Inc.
I am an Animal
Fast Food Nation
Super Size Me
Image found on Pinterest.
#vegan#veganism#animal rights#factory farming#factory farms#vegan documentaries#animal rights documentaies
5 notes
Text
A bigger splash (1967)
3 notes
Text
https://rumble.com/v6g8fcy-contact-the-ce5-experience-2023-documentary.html
0 notes
Text
Selection process 25/2
The call for applications (Edital) for selecting new resident members of JUC7 for 25/2 is now open. The registration and document submission period runs from 02/06/2025 to 28/06/2025. From 26/07/2025, applicants who meet the documentation requirements will be informed about the second stage, which consists of a remote interview via Meet, with the link sent by e-mail. Individual interviews are scheduled for 27/07/2025, starting at 2 p.m. We reiterate that candidates will be informed about the process via the e-mail registered on the application form.
To access the Important Documents folder containing the Call Notice (Edital), Statute, Internal Regulations, Consent Form, and Checklist, click here.
To register using the online form, click here.
3 notes
Text
that is OZAN SAYLAK, classified as an ALPHA. he is THIRTY-TWO YEARS OLD and a native of ISTANBUL, TURKEY, but he is currently residing nearby in YONGDAM. these days, he is a RETIRED OLYMPIC SWIMMER. they say he is very LOYAL and KIND. and he can also be LONELY and a PERFECTIONIST, can you believe it? but when he passes by, he leaves behind that scent of RED CEDAR WITH NOTES OF GREEN TEA AND VETIVER that is hard to ignore.
Ozan, or Ozie, is a 32-year-old Turkish alpha who traded the shine of Olympic pools for the serenity of Jeju. He moved to the island a short time ago, right after retiring from the sport. The middle child of an influential Istanbul family, he always preferred books, the pool, animals, and outdoor activities to the spotlight of fame. Kind, reserved, and studious, he leads a quiet routine in Yongdam; he struggles somewhat with the language but has been managing. Trained as a veterinarian, he doesn't rule out opening a clinic and settling on the island.
Hobbies: riding his motorcycle, walking, running, hiking, working out, reading, growing aromatic herbs in pots, listening to instrumental music or Turkish folk.
Favorite foods: simple, homemade dishes such as mercimek çorbası (lentil soup), rice with vegetables, fresh fruit, doenjang jjigae, bibimbap, sundubu jjigae, and bulgogi.
Favorite drinks: Turkish tea, ayran (a salted yogurt drink), herbal infusions, and beer.
Audiovisual: documentary series about wildlife, slow-cooking shows, historical dramas, environmental documentaries, animation, and action and superhero films.
The middle child of a renowned family of Turkish origin, Ozan grew up surrounded by glamour, demands, and expectations. His mother, a well-known society columnist, always expected her children to inherit not only her influence but also her ambition for fame and money. Ozan, however, never fit that mold. From an early age, he showed more interest in animals, aquariums, and biology books than in fame and money.
Restrained, gentle, and a bit absent-minded, he was always the quietest of the siblings, someone who prefers to listen rather than fight for attention. He doesn't like gossip or confrontation, but if it means protecting his family, he would take on the whole world. Despite the clashes with his mother, he loves her deeply. He knows that behind the theatrical facade there is genuine affection.
He swam for most of his life; he might as well have been born in a pool, he loved the sport that much. He started at age 5, entered competitions at 8, and by 18 was part of the Turkish Olympic team. He retired early, at 32, after a solid career, to pursue his true passion: caring for animals. When he was in South Korea for a competition years earlier, he fell in love with Jeju during a visit with friends and promised himself he would return for a calmer fresh start.
Now that he is retired, he has decided to visit the island again. He doesn't know whether he will stay long, but he doesn't rule out opening a veterinary clinic there. Discreet, he lives a quiet routine of hikes along trails and constant trips to the library, where he reads mainly about animal anatomy, botany, and veterinary medicine. Despite his reserved air, he is sweet, witty, and surprisingly funny among friends.
6 notes
Note
from what everything indicates, nicholas took a photo at a halloween party with some people dressed up as lyle and erik menendez, and there are also rumors on tiktok where some girls said he has cheated on his current girlfriend.
(there's also the fact that he gave a quick kiss to the actress who plays his mother in the series, but I don't think that's a big deal)
💀💀💀💀 and it's because of this and a few other things that I will always tend to be against true-crime series that aren't strictly documentary. this is CRAAAAZY, who DRESSES UP as that, how could they. how horrifying!!!
#they don't surprise me that much either#it had to be americans doing dumb stuff#what's in that people's water#diva talk 𐙚
2 notes
Text
Advanced Google Cloud LlamaIndex RAG Implementation
Introduction: RAG is changing how we construct Large Language Model (LLM)-powered apps, but unlike tabular machine learning, where XGBoost reigns supreme, there is no "go-to" RAG configuration. Developers need fast ways to test retrieval methods. This article shows how to quickly prototype and evaluate RAG solutions using LlamaIndex, Streamlit, RAGAS, and Google Cloud's Gemini models. Going beyond the basics, we'll develop reusable components, extend the frameworks, and consistently test performance.
LlamaIndex RAG
Building RAG apps with LlamaIndex is powerful: it makes linking, arranging, and querying data with LLMs easier. The LlamaIndex RAG workflow breaks down as follows:
Indexing and storage: chunking, embedding, organizing, and structuring documents so they can be queried.
Retrieval: obtaining the document parts relevant to a user query. Nodes are the document chunks a LlamaIndex index retrieves.
Node post-processing: given a collection of relevant nodes, reranking them to make them more relevant.
Response synthesis: given a final collection of relevant nodes, curating a response for the user.
From keyword search to agentic methods, LlamaIndex provides several combinations and integrations to fulfill these stages.
Storing and indexing
The indexing and storage process is complicated: you must construct distinct indexes for diverse data sources; choose parsing, chunking, and embedding algorithms; and extract metadata. Despite this complexity, indexing and storage boil down to pre-processing a set of documents so that a retrieval system can fetch the important sections, and then storing the results.
The Document AI Layout Parser, available from Google Cloud, can process HTML, PDF, DOCX, and PPTX (in preview) and identify text blocks, paragraphs, tables, lists, titles, headings, and page headers and footers out of the box, which simplifies this step. Layout Parser maintains the document's organizational structure through a thorough layout analysis, enabling retrieval of context-aware information.
We must then generate LlamaIndex nodes from the chunked documents. LlamaIndex nodes include metadata attributes that track parent-document structure; for example, LlamaIndex can express a lengthy text broken into parts as a doubly-linked list of nodes, with PREV and NEXT relationships set to the neighboring node IDs.
Pre-processing LlamaIndex nodes before embedding enables advanced retrieval methods like auto-merging retrieval. The Hierarchical Node Parser groups nodes from a document into a hierarchy in which each level reflects a bigger piece of the document, starting with 512-character leaf chunks that link to 1024-character parent chunks. Only the leaf chunks are embedded; the remainder are stored in a document store for lookup by ID. At retrieval time, we compute vector similarity only over the leaf chunks and exploit the hierarchical relationships to pull in more context from bigger document parts. LlamaIndex's Auto-Merging Retriever applies this logic.
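A minimal sketch of this hierarchical parsing step with LlamaIndex's post-0.10 API; the chunk sizes mirror the 512-character leaves and larger parents described above, and parsed_texts is a hypothetical list of strings from the Layout Parser step:

from llama_index.core import Document
from llama_index.core.node_parser import HierarchicalNodeParser, get_leaf_nodes

# Assumed input: one Document per parsed source text (parsed_texts is hypothetical).
documents = [Document(text=text) for text in parsed_texts]

# Three-level hierarchy: 2048-char grandparents, 1024-char parents, 512-char leaves.
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 1024, 512])
nodes = node_parser.get_nodes_from_documents(documents)

# Only the leaves get embedded; parents are kept for ID lookups at retrieval time.
leaf_nodes = get_leaf_nodes(nodes)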
Next, embed the nodes and pick how and where to store them for later retrieval. Vector databases are the obvious choice, but we may need to store content in another form to enable hybrid search alongside semantic retrieval. We demonstrate how to establish a hybrid store with Google Cloud's Vertex AI Vector Search and Firestore, storing document chunks both as embedded vectors and as key-value entries. This lets us query documents by vector similarity or by ID/metadata match.
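A sketch of wiring the docstore and vector index together; it uses the in-memory SimpleDocumentStore for brevity, where the post's production setup would substitute the Vertex AI Vector Search and Firestore integrations:

from llama_index.core import StorageContext, VectorStoreIndex
from llama_index.core.storage.docstore import SimpleDocumentStore

docstore = SimpleDocumentStore()
docstore.add_documents(nodes)  # every hierarchy level, retrievable by node ID

storage_context = StorageContext.from_defaults(docstore=docstore)

# Embeds the leaf nodes only; set Settings.embed_model to a Vertex AI embedding model first.
index = VectorStoreIndex(leaf_nodes, storage_context=storage_context)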
Create multiple indices to compare combinations of approaches; for example, as an alternative to the hierarchical index, we can build a flat index of fixed-size chunks.
Retrieval
Retrieval brings a limited number of relevant document chunks from the vector store/docstore combination to an LLM for a context-based response. The LlamaIndex Retriever module abstracts this work well: subclasses implement the _retrieve function, which accepts a query and returns a list of NodeWithScore objects, i.e., document chunks scored for relevance to the query. LlamaIndex offers many popular retrievers; always try a baseline retriever first, one that uses plain vector similarity search to fetch the top-k NodeWithScore results.
Auto-merging retrieval
The baseline_retriever does not use the hierarchical index structure we established earlier. Because the document store holds the hierarchy of chunks, an auto-merging retriever can recover nodes based on both vector similarity and their position in the source document, thereby obtaining extra material that encompasses the originally matched pieces. Suppose the baseline_retriever retrieves five node chunks based on vector similarity.
If the question is complicated, those 512-character chunks may not contain enough information to answer it. Three of the five chunks may come from the same page and reference distinct paragraphs within one section. Because we recorded their hierarchy, their relation to larger chunks, and their adjacency, the auto-merging retriever can "walk" the hierarchy, fetch the bigger chunks, and provide a larger piece of the document from which the LLM can build a response. This balances the retrieval precision of shorter chunk sizes against the LLM's need for sufficient relevant data.
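A sketch of both retrievers, reusing the index and storage_context assumed in the indexing sketches above:

from llama_index.core.retrievers import AutoMergingRetriever

# Baseline: plain top-k vector similarity over the embedded leaf chunks.
baseline_retriever = index.as_retriever(similarity_top_k=5)

# Auto-merging: when enough sibling leaves are retrieved, swap them for their parent chunk.
auto_merging_retriever = AutoMergingRetriever(baseline_retriever, storage_context, verbose=True)

nodes_with_scores = auto_merging_retriever.retrieve("Did the typical family net worth change?")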
LlamaIndex Search
Given a collection of NodeWithScores, we must determine their ideal arrangement, and formatting or deleting PII may be necessary. We then hand these pieces to an LLM to generate the user's intended response. The LlamaIndex QueryEngine manages retrieval, node post-processing, and answer synthesis: passing a retriever, a node post-processing method (if applicable), and a response synthesizer as inputs creates a QueryEngine. QueryEngine's query and aquery (asynchronous query) methods accept a string query and return a Response object containing the LLM-generated answer and a list of NodeWithScores.
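A sketch of assembling a QueryEngine from these pieces (the reranker and synthesizer arguments are optional and are defined in the sections below):

from llama_index.core.query_engine import RetrieverQueryEngine

query_engine = RetrieverQueryEngine.from_args(
    auto_merging_retriever,
    # node_postprocessors=[reranker],    # see the reranking section below
    # response_synthesizer=synthesizer,  # see the response synthesis section below
)
response = query_engine.query("Did the typical family net worth change?")
print(response.response)      # the generated answer
print(response.source_nodes)  # the NodeWithScores used as context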
Hypothetical document embedding
Most LlamaIndex retrievers work by embedding the user's query and calculating vector similarity against the vector storage. Because questions and answers have different language structures, this may be unsatisfactory. Hypothetical document embedding (HyDE) uses LLM hallucination to address this: hallucinate an answer to the user's query without any context, then embed that hallucinated answer for the vector similarity search.
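LlamaIndex ships HyDE as a query transform; a sketch wrapping the query engine from above:

from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine

# include_original=True embeds the hallucinated answer alongside the original query.
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, query_transform=hyde)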
Reranking nodes with an LLM
A node post-processor in LlamaIndex implements _postprocess_nodes, which takes the query and a list of NodeWithScores as input and produces a new list. We may want to rerank the retriever's nodes by LLM-judged relevancy to improve their ordering. There are models built explicitly for reranking chunks against a query, or we can use a general-purpose LLM.
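A sketch using LlamaIndex's built-in LLM-based reranker (it scores candidate nodes with whatever LLM is configured in Settings, e.g. a Gemini model):

from llama_index.core.postprocessor import LLMRerank

reranker = LLMRerank(
    choice_batch_size=5,  # nodes scored per LLM call
    top_n=3,              # nodes kept after reranking
)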
Response synthesis
Many techniques exist for directing an LLM to respond given a list of NodeWithScores. We may summarize huge nodes before requesting a final answer, or we may wish to give the LLM a second opportunity to improve or amend an initial answer. The LlamaIndex Response Synthesizer lets us decide how the LLM responds to a list of nodes.
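A sketch of choosing a synthesis strategy: "refine" gives the LLM a pass to amend an initial answer, while "tree_summarize" condenses large node sets before the final response:

from llama_index.core import get_response_synthesizer

synthesizer = get_response_synthesizer(response_mode="refine")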
ReAct agent
We can add a reasoning loop to the query pipeline using ReAct (Yao et al., 2022). This lets an LLM use chain-of-thought reasoning to answer complicated questions that need several retrieval steps. To build a ReAct loop in LlamaIndex, we expose the query_engine as a tool the ReAct agent can use for thinking and acting. Multiple tools may be added so the agent can choose among them or condense results.
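A sketch of exposing the query engine as an agent tool (the tool name and description are illustrative):

from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

scf_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="scf_report",
    description="Answers questions about the Survey of Consumer Finances report.",
)

# The agent reasons step by step and may call the tool several times per question.
agent = ReActAgent.from_tools([scf_tool], verbose=True)
answer = agent.chat("How did median net worth change versus the prior decade?")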
Final QueryEngine Creation
After choosing among the approaches in the stages above, you must write logic to construct your QueryEngine from an input configuration, as in the sketch below.
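A hypothetical factory sketch assembling the pieces above from a configuration dict; all config keys, plus the module-level index and storage_context, are assumptions for illustration:

from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.postprocessor import LLMRerank
from llama_index.core.query_engine import RetrieverQueryEngine, TransformQueryEngine
from llama_index.core.retrievers import AutoMergingRetriever

def build_query_engine(config: dict):
    """Build a QueryEngine for one experiment configuration."""
    retriever = index.as_retriever(similarity_top_k=config.get("top_k", 5))
    if config.get("auto_merging", True):
        retriever = AutoMergingRetriever(retriever, storage_context)
    postprocessors = []
    if config.get("rerank", False):
        postprocessors.append(LLMRerank(top_n=config.get("rerank_top_n", 3)))
    engine = RetrieverQueryEngine.from_args(retriever, node_postprocessors=postprocessors)
    if config.get("hyde", False):
        engine = TransformQueryEngine(engine, query_transform=HyDEQueryTransform())
    return engine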
Methods for evaluation
After creating a QueryEngine object, we can easily send queries and get back RAG pipeline replies and context. Next, we can serve the QueryEngine object from a backend service such as FastAPI, with a small front-end to play with it (conversational or batch).
When conversing with the RAG pipeline, the query, the retrieved context, and the response can be used to analyze the response's quality. From these three elements we can compute evaluation metrics and objectively compare replies. Based on this triad, RAGAS provides heuristic measures of response faithfulness, answer relevancy, and context relevancy, which we can calculate and present with each chat exchange.
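A sketch of scoring one exchange with the classic RAGAS API (RAGAS needs an LLM and embedding model configured, which this post would pair with Gemini); response is the Response object from the QueryEngine sketch above:

from datasets import Dataset
from ragas import evaluate
from ragas.metrics import answer_relevancy, faithfulness

eval_data = Dataset.from_dict({
    "question": ["Did the typical family net worth change?"],
    "answer": [str(response)],
    "contexts": [[n.node.get_content() for n in response.source_nodes]],
})
scores = evaluate(eval_data, metrics=[faithfulness, answer_relevancy])
print(scores)  # e.g. faithfulness and answer_relevancy scores per sample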
Expert annotation should also be used to establish ground-truth responses, since ground truth allows RAG pipeline performance to be assessed more rigorously. We can determine LLM-graded accuracy by asking an LLM whether the response matches the ground truth, or compute other RAGAS measures such as context precision and recall.
Deployment
The FastAPI backend exposes /query_rag and /eval_batch. /query_rag serves one-off interactions with the query engine and can evaluate the response on the fly. With /eval_batch, users choose an eval_set from a Cloud Storage bucket and run a batch evaluation under a given set of query engine parameters.
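A sketch of the /query_rag endpoint, reusing the hypothetical build_query_engine factory from earlier; the request fields are illustrative knobs matching that factory's config keys:

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    query: str
    auto_merging: bool = True
    rerank: bool = False
    top_k: int = 5

@app.post("/query_rag")
async def query_rag(req: QueryRequest):
    engine = build_query_engine(req.model_dump())
    response = await engine.aquery(req.query)  # async variant of query()
    return {
        "answer": str(response),
        "contexts": [n.node.get_content() for n in response.source_nodes],
    }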
Streamlit's chat components make it simple to whip up a UI, with sliders and input forms matching our specification, that communicates with the QueryEngine object via the FastAPI backend.
Conclusion
Building a sophisticated RAG application on Google Cloud using modular technologies like LlamaIndex, RAGAS, FastAPI, and Streamlit gives you maximum flexibility as you experiment with different approaches and RAG pipeline tweaks. Maybe you'll discover the "XGBoost" equivalent for your RAG problem in a miraculous mix of settings, prompts, and algorithms.
Read more on govindhtech.com
#AdvancedGoogleCloud#LlamaIndexRAG#Implementation#Deployment#VertexAI#GoogleCloud#DocumentAI#Geminimodels#REACTagent#Automaticretrieval#LlamaIndexSearch#RerankingLLMnodes#news#CloudStoragebucket#technology#technews#govindhtech
0 notes
Text
COVID COMMISSION, BAGNAI PREVIEWS ▷ "TACHIPIRINA? I DOCUMENTED EVERYTHING AND..."
(YouTube video)
Then there are those who would like to bury everything so that there are no guilty parties, so as not to end up like the fans of players who cheated.
3 notes
Text
Hyena in peril: Epic rescue in the savanna - Namibia - Documenta...
(YouTube video)
⚠️💐👏👍🏻 BRAVO
2 notes
Text
Caf launches a solution for preventing document fraud
The growing incidence of document fraud has been challenging companies across different sectors, causing financial losses and undermining institutions' credibility. Among the examples: in the healthcare sector, according to the most recent (2022) report from the Institute for Supplementary Health Studies (IESS), fraud and waste involving the manipulation of invoices, medical prescriptions and…
0 notes