#DocStore
Explore tagged Tumblr posts
exmcloud · 5 months ago
Photo
Tumblr media
The future of document management is here. Discover the trends and how DocStore keeps you at the forefront! 2025 Document Management Trends:
𝗔𝗜 & 𝗔𝘂𝘁𝗼𝗺𝗮𝘁𝗶𝗼𝗻 𝗧𝗮𝗸𝗲 𝗖𝗲𝗻𝘁𝗲𝗿 𝗦𝘁𝗮𝗴𝗲: Automate tasks, organize smarter, and streamline workflows. DocStore brings AI to your document processes.
𝗦𝗲𝗰𝘂𝗿𝗶𝘁𝘆 𝗶𝘀 𝗡𝗼𝗻-𝗡𝗲𝗴𝗼𝘁𝗶𝗮𝗯𝗹𝗲: Protect data with advanced permissions and encryption. Stay secure and compliant with DocStore.
𝗜𝗻𝘁𝗲𝗴𝗿𝗮𝘁𝗶𝗼𝗻 𝗶𝘀 𝗞𝗲𝘆: Connect effortlessly with SharePoint, Google Drive, Azure, and more. DocStore keeps everything in sync within Salesforce.
2 notes · View notes
govindhtech · 7 months ago
Text
Advanced Google Cloud LlamaIndex RAG Implementation
Tumblr media
Introduction
Retrieval-augmented generation (RAG) is changing how we construct Large Language Model (LLM)-powered apps, but unlike tabular machine learning, where XGBoost is the go-to choice, there is no single default. Developers need fast ways to test retrieval methods. This article shows how to quickly prototype and evaluate RAG solutions using LlamaIndex, Streamlit, RAGAS, and Google Cloud's Gemini models. Going beyond basic tutorials, it builds reusable components, extends the frameworks, and tests performance consistently.
LlamaIndex RAG
Building RAG apps with LlamaIndex is powerful: it makes linking, organizing, and querying data with LLMs much easier. The LlamaIndex RAG workflow breaks down into the following stages; a minimal end-to-end sketch follows the list.
Indexing and storage: chunking, embedding, organizing, and structuring documents so they can be queried.
Retrieval: fetching the document parts relevant to a user query. Nodes are the document chunks that a LlamaIndex index retrieves.
Reranking: given a collection of relevant nodes, reorder them so the most relevant come first.
Response synthesis: given the final collection of relevant nodes, curate a response for the user.
From keyword search to agentic methods, LlamaIndex provides many combinations and integrations to fulfill each of these stages.
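As a reference point, here is a minimal end-to-end sketch of these stages. It assumes llama-index >= 0.10 package namespaces and the Vertex AI LLM/embedding integrations; the model names and data path are illustrative, not taken from the article.

```python
# Minimal baseline RAG pipeline sketch (assumed llama-index >= 0.10 namespaces).
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex, Settings
from llama_index.llms.vertex import Vertex                     # assumed integration package
from llama_index.embeddings.vertex import VertexTextEmbedding  # assumed integration package

# Configure the LLM and embedding model globally (names are illustrative;
# the Vertex integrations may also need project/location/credentials).
Settings.llm = Vertex(model="gemini-1.5-pro")
Settings.embed_model = VertexTextEmbedding(model_name="text-embedding-004")

# 1) Indexing and storage: load, chunk, embed, and index the documents.
documents = SimpleDirectoryReader("./data").load_data()
index = VectorStoreIndex.from_documents(documents)

# 2-4) Retrieval, optional reranking, and response synthesis via a query engine.
query_engine = index.as_query_engine(similarity_top_k=5)
response = query_engine.query("What does the policy say about refunds?")
print(response)
```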
Storing and indexing
The indexing and storage process is involved: you must build separate indexes for different data sources, choose parsing, chunking, and embedding algorithms, and extract metadata. At its core, though, indexing and storage amount to pre-processing a set of documents so a retrieval system can find the important sections, and then storing the results.
Google Cloud's Document AI Layout Parser can process HTML, PDF, DOCX, and PPTX (in preview) and identifies text blocks, paragraphs, tables, lists, titles, headings, and page headers and footers out of the box, which simplifies choosing a parsing path. Layout Parser preserves the document's organizational structure through a thorough layout analysis, which is what enables context-aware retrieval.
The chunked documents must then be turned into LlamaIndex nodes. Nodes carry metadata attributes that track the structure of the parent document; for example, a long text split into parts can be expressed as a doubly-linked list of nodes whose PREV and NEXT relationships are set to the neighboring node IDs.
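As a hedged illustration of wiring those relationships up by hand (chunk contents and IDs are placeholders), here is a minimal sketch using the llama-index core schema, where the enum member is named PREVIOUS rather than PREV:

```python
from llama_index.core.schema import TextNode, NodeRelationship, RelatedNodeInfo

# Two consecutive chunks from the same parent document (contents are placeholders).
node_a = TextNode(text="First chunk of the document...", id_="chunk-1")
node_b = TextNode(text="Second chunk of the document...", id_="chunk-2")

# Link them as a doubly-linked list: NEXT on the first, PREVIOUS on the second.
node_a.relationships[NodeRelationship.NEXT] = RelatedNodeInfo(node_id=node_b.node_id)
node_b.relationships[NodeRelationship.PREVIOUS] = RelatedNodeInfo(node_id=node_a.node_id)
```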
Pre-processing the LlamaIndex nodes before embedding enables advanced retrieval methods such as auto-merging retrieval. The Hierarchical Node Parser groups a document's nodes into a hierarchy in which each level represents a larger span of the document, for example 512-character leaf chunks linked to 1024-character parent chunks. Only the leaf chunks are embedded; the rest are kept in a document store and fetched by ID. At retrieval time, vector similarity is computed only over the leaf chunks, and the hierarchical relationships are used to pull in more context from larger document sections. The LlamaIndex Auto-Merging Retriever implements this logic.
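A minimal sketch of that parsing step, assuming the llama-index core node parser API and the 1024/512 chunk sizes from the example above:

```python
from llama_index.core.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.core.storage.docstore import SimpleDocumentStore

# Parse documents into a two-level hierarchy: 1024-char parents, 512-char leaves.
parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[1024, 512])
nodes = parser.get_nodes_from_documents(documents)
leaf_nodes = get_leaf_nodes(nodes)

# All nodes (parents and leaves) go into a document store for later ID lookups;
# only the leaf nodes will be embedded into the vector index.
docstore = SimpleDocumentStore()
docstore.add_documents(nodes)
```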
Next, embed the nodes and decide how and where to store them for later retrieval. A vector database is the obvious choice, but you may also need to store content in another form to enable hybrid search alongside semantic retrieval. The article demonstrates a hybrid store on Google Cloud, using Vertex AI Vector Search and Firestore to hold document chunks both as embedded vectors and as key-value records, so documents can be queried by vector similarity or by ID/metadata match.
Multiple indices should be created to compare combinations of approaches; as an alternative to the hierarchical index, you might design a flat index of fixed-size chunks (see the storage sketch below).
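A minimal storage sketch, continuing from the parser example; for brevity it uses LlamaIndex's in-memory stores where the managed setup would plug in Vertex AI Vector Search and Firestore:

```python
from llama_index.core import StorageContext, VectorStoreIndex

# Keep every node (parents and leaves) available for ID/metadata lookups,
# but embed only the leaf nodes into the vector index.
storage_context = StorageContext.from_defaults(docstore=docstore)
hierarchical_index = VectorStoreIndex(leaf_nodes, storage_context=storage_context)

# A second, flat index of fixed-size chunks for comparison experiments.
flat_index = VectorStoreIndex.from_documents(documents)
```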
Retrieval
Retrieval pulls a limited number of relevant documents from the vector store/docstore combination and hands them to an LLM for a context-based response. The LlamaIndex Retriever module abstracts this work well: subclasses implement the _retrieve function, which accepts a query and returns a list of NodeWithScore objects, i.e. document chunks scored for relevance to the query. LlamaIndex ships many popular retrievers; always start with a baseline retriever that uses vector similarity search to fetch the top-k NodeWithScore results.
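The baseline is a one-liner; this sketch assumes the flat index built earlier and an illustrative query:

```python
# Baseline retrieval: top-k vector similarity over the flat index.
baseline_retriever = flat_index.as_retriever(similarity_top_k=5)
nodes_with_scores = baseline_retriever.retrieve("How do I file an expense report?")
for nws in nodes_with_scores:
    print(nws.score, nws.node.node_id)
```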
Auto-merging retrieval
The baseline_retriever does not take advantage of the hierarchical index structure established earlier. With the hierarchy of chunks held in a document store, an auto-merging retriever can recover nodes based on both vector similarity and their position in the source document, pulling in extra material that surrounds the originally matched chunks. Suppose the baseline_retriever retrieves five node chunks by vector similarity.
If the question is complex, those 512-character chunks may not contain enough information to answer it. Three of the five chunks might come from the same page and reference distinct paragraphs within one section. Because the hierarchy records each chunk's relationship to its larger parents, the auto-merging retriever can "walk" up that hierarchy, fetching bigger chunks and handing the LLM a larger, contiguous piece of the document to build its response from. This balances the retrieval precision of small chunks against the LLM's need for enough relevant context.
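A sketch of wiring this up with LlamaIndex's built-in retriever, reusing the hierarchical index and storage context from the storage sketch:

```python
from llama_index.core.retrievers import AutoMergingRetriever

# Retrieve leaf chunks by vector similarity, then merge them into their larger
# parent chunks when enough siblings from the same parent are retrieved.
leaf_retriever = hierarchical_index.as_retriever(similarity_top_k=6)
auto_merging_retriever = AutoMergingRetriever(
    leaf_retriever,
    storage_context,  # holds the full node hierarchy for parent lookups
    verbose=True,
)
merged_nodes = auto_merging_retriever.retrieve("Summarize the refund policy")
```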
Querying with LlamaIndex
With a collection of NodeWithScore objects in hand, you must decide how best to arrange them; formatting or removing PII may also be necessary. The chunks are then handed to an LLM to produce the user's intended response. The LlamaIndex QueryEngine manages retrieval, node post-processing, and answer synthesis: you create one by passing in a retriever, node post-processors (if applicable), and a response synthesizer. Its query and aquery (asynchronous query) methods accept a string query and return a Response object containing the LLM-generated answer and a list of NodeWithScore objects.
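A sketch of assembling a QueryEngine from the pieces above; the response mode and query text are illustrative:

```python
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# Combine a retriever, optional node post-processors, and a response synthesizer.
query_engine = RetrieverQueryEngine.from_args(
    retriever=auto_merging_retriever,
    node_postprocessors=[],  # rerankers, PII scrubbers, etc. would go here
    response_synthesizer=get_response_synthesizer(response_mode="compact"),
)
response = query_engine.query("What are the parental leave rules?")
print(response.response)            # the LLM-generated answer
print(len(response.source_nodes))   # the NodeWithScore list used as context
```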
Hypothetical document embedding (HyDE)
Most LlamaIndex retrievers work by embedding the user's query and computing vector similarity against the vector store. Because questions and answers have different linguistic structures, this can fall short. Hypothetical document embedding (HyDE) turns LLM hallucination to advantage: the LLM hallucinates an answer to the user's query without any context, and that hypothetical answer is embedded and used for the vector similarity search instead.
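A sketch of wrapping the existing query engine with LlamaIndex's built-in HyDE transform; the query is illustrative:

```python
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.query_engine import TransformQueryEngine

# Hallucinate a hypothetical answer, embed it, and retrieve against that
# embedding (optionally alongside the original question embedding).
hyde = HyDEQueryTransform(include_original=True)
hyde_query_engine = TransformQueryEngine(query_engine, query_transform=hyde)
response = hyde_query_engine.query("Why did revenue dip in Q3?")
```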
Reranking nodes with an LLM
A node post-processor in LlamaIndex implements _postprocess_nodes, which takes the query and the list of NodeWithScore objects as input and produces a new list. You may want to rerank the nodes returned by the retriever according to LLM-judged relevance to improve their ordering; there are dedicated reranking models for this, or a general-purpose LLM can be used.
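LlamaIndex ships an LLM-based reranker as a node post-processor; a minimal sketch, with the batch size and top-n as illustrative values:

```python
from llama_index.core.postprocessor import LLMRerank
from llama_index.core.query_engine import RetrieverQueryEngine

# Ask the LLM to pick and reorder the most relevant chunks for the query.
reranker = LLMRerank(choice_batch_size=5, top_n=3)
reranked_engine = RetrieverQueryEngine.from_args(
    retriever=auto_merging_retriever,
    node_postprocessors=[reranker],
)
```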
Response synthesis
There are many ways to direct an LLM to respond given a list of NodeWithScore objects. You might summarize huge nodes before asking the LLM for a final answer, or give the LLM a second pass to refine or amend an initial answer. The LlamaIndex Response Synthesizer lets us decide how the LLM turns a list of nodes into a response.
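A sketch of two built-in synthesis strategies: tree_summarize summarizes groups of nodes hierarchically before answering, while refine lets the LLM iteratively improve an initial answer:

```python
from llama_index.core import get_response_synthesizer

# Hierarchically summarize large sets of nodes before producing the final answer.
tree_synth = get_response_synthesizer(response_mode="tree_summarize")

# Or iteratively refine an initial answer, feeding the nodes in one at a time.
refine_synth = get_response_synthesizer(response_mode="refine")
```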
ReAct agent
A reasoning loop can be added to the query pipeline using ReAct (Yao et al., 2022). This lets an LLM use chain-of-thought reasoning to answer complicated questions that need several retrieval steps. In LlamaIndex, the query_engine is exposed to the ReAct agent as a tool it can reason about and act with; multiple tools may be added so the agent can choose between them or condense their results.
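A sketch of exposing the query engine built earlier as an agent tool; the tool name, description, and question are illustrative:

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import QueryEngineTool

# Wrap the query engine as a tool the agent can call during its reasoning loop.
rag_tool = QueryEngineTool.from_defaults(
    query_engine=query_engine,
    name="document_search",
    description="Answers questions from the indexed document corpus.",
)
agent = ReActAgent.from_tools([rag_tool], verbose=True)
answer = agent.chat("Compare the 2022 and 2023 travel policies.")
```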
Final QueryEngine Creation
After choosing among the options for each stage above, you must write logic that constructs your QueryEngine from an input configuration. A sketch of such a factory function follows.
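This is only a hedged sketch: the RagConfig dataclass is a hypothetical stand-in for the article's input configuration, and the index/storage objects refer to the earlier sketches.

```python
from dataclasses import dataclass

from llama_index.core import get_response_synthesizer
from llama_index.core.indices.query.query_transform import HyDEQueryTransform
from llama_index.core.postprocessor import LLMRerank
from llama_index.core.query_engine import RetrieverQueryEngine, TransformQueryEngine
from llama_index.core.retrievers import AutoMergingRetriever

@dataclass
class RagConfig:
    """Hypothetical configuration of the retrieval options chosen above."""
    retriever: str = "auto_merging"   # or "baseline"
    use_hyde: bool = False
    use_llm_rerank: bool = False
    response_mode: str = "compact"
    similarity_top_k: int = 5

def build_query_engine(cfg: RagConfig):
    """Assemble a QueryEngine from a config; index/storage objects come from earlier sketches."""
    if cfg.retriever == "auto_merging":
        retriever = AutoMergingRetriever(
            hierarchical_index.as_retriever(similarity_top_k=cfg.similarity_top_k),
            storage_context,
        )
    else:
        retriever = flat_index.as_retriever(similarity_top_k=cfg.similarity_top_k)

    postprocessors = [LLMRerank(top_n=3)] if cfg.use_llm_rerank else []
    engine = RetrieverQueryEngine.from_args(
        retriever=retriever,
        node_postprocessors=postprocessors,
        response_synthesizer=get_response_synthesizer(response_mode=cfg.response_mode),
    )
    if cfg.use_hyde:
        engine = TransformQueryEngine(engine, query_transform=HyDEQueryTransform())
    return engine
```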
Methods for evaluation
Once a QueryEngine object exists, you can easily send it queries and get back both the RAG pipeline's response and the retrieved context. The next step is to stand the QueryEngine up behind a backend service such as FastAPI, with a small front end for experimenting with it in both conversational and batch modes.
When conversing with the RAG pipeline, the query, the retrieved context, and the response can be used to analyze answer quality. From these three pieces you can compute evaluation metrics and compare replies objectively. Based on this triad, RAGAS provides heuristic measures for response faithfulness, answer relevancy, and context relevancy, which can be calculated and displayed with each chat exchange.
Expert annotation should also be used to produce ground-truth answers, which allow RAG pipeline performance to be assessed more thoroughly. With ground truth available, you can measure LLM-graded correctness by asking an LLM whether the response matches the ground truth, along with other RAGAS measures such as context precision and recall.
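A sketch of scoring a small batch with RAGAS, assuming a ragas 0.1-style API; the single-row dataset is an illustrative placeholder, and ragas needs an evaluation LLM and embeddings configured per its documentation:

```python
from datasets import Dataset
from ragas import evaluate
from ragas.metrics import faithfulness, answer_relevancy, context_precision, context_recall

# Each row is one RAG exchange: question, retrieved contexts, answer, and
# (optionally) an expert-annotated ground truth.
eval_data = Dataset.from_dict({
    "question": ["What is the refund window?"],
    "contexts": [["Refunds are accepted within 30 days of purchase."]],
    "answer": ["Refunds are accepted within 30 days."],
    "ground_truth": ["Customers may request a refund within 30 days of purchase."],
})

# Compute aggregate metrics for the batch (evaluation LLM configured separately).
scores = evaluate(
    eval_data,
    metrics=[faithfulness, answer_relevancy, context_precision, context_recall],
)
print(scores)
```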
Deployment
The FastAPI backend exposes two endpoints: /query_rag and /eval_batch. /query_rag handles one-off interactions with the query engine and can evaluate the response on the fly; /eval_batch lets users pick an eval set from a Cloud Storage bucket and run a batch evaluation with a given set of query-engine parameters.
Streamlit's chat components, together with sliders and input forms matched to those parameters, make it simple to whip up a UI that talks to the QueryEngine object through the FastAPI backend.
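A hedged sketch of the backend; the request/response shapes and the eval-set handling are illustrative placeholders rather than the article's actual service, and query_engine refers to the object built earlier:

```python
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str
    evaluate: bool = False  # optionally score the reply on the fly

@app.post("/query_rag")
def query_rag(req: QueryRequest):
    """One-off interaction with the query engine."""
    response = query_engine.query(req.question)
    return {
        "answer": str(response),
        "contexts": [n.node.get_content() for n in response.source_nodes],
    }

@app.post("/eval_batch")
def eval_batch(eval_set: str, retriever: str = "auto_merging"):
    """Run a batch evaluation over a named eval set (e.g. stored in a Cloud Storage bucket)."""
    # Placeholder: load the eval set, build a query engine from the chosen parameters,
    # score the batch with RAGAS, and return the aggregate metrics.
    return {"status": "not implemented in this sketch", "eval_set": eval_set}
```

A Streamlit front end would then simply POST the chat input to /query_rag and render the returned answer and contexts.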
Conclusion
Building a sophisticated RAG application on Google Cloud with modular technologies like LlamaIndex, RAGAS, FastAPI, and Streamlit gives you maximum flexibility as you experiment with different approaches and RAG pipeline tweaks. Maybe you'll even discover the "XGBoost" equivalent for your RAG problem in a miraculous mix of settings, prompts, and algorithms.
Read more on govindhtech.com
0 notes
jamesjbkim · 9 months ago
Text
Zoroastrianism and Christianity share some thematic similarities, but they are distinct religions with different core beliefs and origins.
### Similarities:
1. **Messianic Figures**: Both religions feature a messianic figure—Saoshyant in Zoroastrianism and Jesus in Christianity—who is expected to bring about the end times and final judgment[2][4].
2. **Dualism**: Zoroastrianism has a strong dualistic element, with Ahura Mazda representing good and Angra Mainyu (Ahriman) representing evil. Christianity also incorporates a form of dualism with God and Satan[2][5].
3. **Resurrection and Final Judgment**: Both religions include beliefs in resurrection and a final judgment[3][5].
### Differences:
1. **Monotheism vs. Dualism**: Christianity is strictly monotheistic, while Zoroastrianism is dualistic, positing two opposing forces of good and evil[1][2].
2. **Concept of God**: In Christianity, God is the source of both good and evil, whereas in Zoroastrianism, Ahura Mazda represents only good, with evil being a separate entity[2].
3. **Scriptural Origins**: The oldest surviving manuscripts of the Zoroastrian scriptures, the Avesta, date from around the 13th century AD, while the Jewish Scriptures predate Christianity by centuries, making direct borrowing unlikely[3].
While there are parallels, the significant theological and doctrinal differences indicate that Christianity is not a copy of Zoroastrianism.
Sources
[1] Is Christianity just a copy of Zoroastrianism? - Reddit https://www.reddit.com/r/Christianity/comments/mhywmy/is_christianity_just_a_copy_of_zoroastrianism/
[2] [PDF] ZOROASTRIANISM, JUDAISM, AND CHRISTIANITY https://olli.gmu.edu/docstore/600docs/1403-651-3-Zoroastrianism,%20Judaism,%20and%20Christianity.pdf
[3] Are the ideas of Jesus and Christianity borrowed from Mithra and ... https://www.gotquestions.org/Jesus-Mithra-Christianity-Zoroastrianism.html
[4] Zoroastrianism and the Resemblances between It and Christianity https://www.jstor.org/stable/3140852
[5] World Religions and Cults: Zoroastrianism | Answers in Genesis https://answersingenesis.org/world-religions/world-religions-and-cults-zoroastrianism/
0 notes
openbooth · 1 year ago
Text
Uber's CacheFront: Powering 40M Reads per Second with Significantly Reduced Latency
Uber developed an innovative caching solution, CacheFront, for its in-house distributed database, Docstore. CacheFront enables over 40M reads per second from online storage and achieves substantial performance improvements, including a 75% reduction in P75 latency and over a 67% reduction in P99 latency.
— https://ift.tt/2msZTuf
0 notes
jayysnotjoyful · 3 months ago
Text
doctors are ignoring me :(
i think they think im crazy
or maybe i am just sleepy idk
i will talk to you tomorrow maybe
maybe not
im really tired :(
i survivedddddddd
cant post pictures rn because im in the doctors still and look like TRASH.
but im alive!!!! doctors said i get to go home in a few days :3
19 notes · View notes
creatitydevelop · 4 years ago
Text
Enjoy optimal performance with Mendix application development
The fast-moving digital age asks for more than traditional techniques to sustain your business. Mendix application development delivers six times higher productivity compared with old-style software development. Your planning and the way you carry out your plans have to be attuned to the needs of your customers. The solutions get a facelift with responsive design, which makes them functional on all your devices, whether mobile, desktop, or tablet.
1 note · View note
m4rvelstuffs · 6 years ago
Photo
Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media Tumblr media
just like if you use or save
181 notes · View notes
abualasad · 5 years ago
Link
1 note · View note
voluminoussoul · 4 years ago
Text
Cited works
All research heavily influenced by:
Cooper, Charlotte. What Is Fat Activism? University of Limerick, 2008, pp. 1–25, http://www.ul.ie/sociology/docstore/workingpapers/wp2008-02.pdf. Accessed 27 April 2021.
Leboeuf, Céline. “What Is Body Positivity? The Path from Shame to Pride.” Philosophical Topics, vol. 47, no. 2, 2019, pp. 113–128. JSTOR, www.jstor.org/stable/26948109. Accessed 23 April 2021.
Spieldenner, Andrew R. “Considering the Queer Disabled/Debilitated Body: An Introduction of Queer Cripping.” QED: A Journal in GLBTQ Worldmaking, vol. 6, no. 3, 2019, pp. 76–80. JSTOR, www.jstor.org/stable/10.14321/qed.6.3.0076. Accessed 20 April 2021.
As well as just generally influenced by: 
Nguyen, Tram. “From SlutWalks to SuicideGirls: Feminist Resistance in the Third Wave and Postfeminist Era.” Women's Studies Quarterly, vol. 41, no. 3/4, 2013, pp. 157–172. JSTOR, www.jstor.org/stable/23611512. Accessed 10 May 2021.
Rhode, Deborah L. “The Injustice of Appearance.” Stanford Law Review, vol. 61, no. 5, 2009, pp. 1033–1101. JSTOR, www.jstor.org/stable/40379704. Accessed 10 May 2021.
0 notes
exmcloud · 5 months ago
Photo
Tumblr media
One platform, endless possibilities. Generate, customize, and manage your documents seamlessly with DocStore!
2 notes · View notes
infotrellis-blog · 7 years ago
Text
Thrive on IBM MDM CE
IBM InfoSphere Master Data Management Collaborative Edition provides a highly scalable, enterprise Product Information Management (PIM) solution that creates a golden copy of product data and becomes the trusted system of record for all product-related information.
Performance is critical for any successful MDM solution, which involves complex design and architecture. Performance issues impede the smooth functioning of an application and keep the business from getting the best out of it. Periodically profiling the application and optimizing it based on the findings is vital for a seamless experience.
InfoTrellis has been providing services in the PIM space for over a decade now to an esteemed clientele spread across the globe.
This blog post details how to optimize an IBM InfoSphere MDM Collaborative Edition application, based on tacit knowledge acquired from implementations and upgrades carried out over the years.
Performance is paramount
Performance is one of the imperative factors that make an application reliable. The performance of MDM Collaborative Edition is influenced by factors such as solution design, implementation, infrastructure, data volume, DB configuration, WebSphere setup, application version, and so on. These factors play a huge role in affecting the business either positively or otherwise. Besides, even in a carefully designed and implemented MDM CE solution, performance issues creep up over time for miscellaneous reasons.
Performance Diagnosis
The following questions might help you to narrow down a performance problem to a specific component.
What exactly is slow – only a specific component, or a general slowness that affects all UI interactions and scheduled jobs?
When did the problem manifest?
Did performance degrade over time or was there an abrupt change in performance after a certain event?
Answers to these questions may not be a panacea, but they provide a good starting point for improving performance.
Hardware Sizing and Tuning
The infrastructure for the MDM CE application is the foundation on which the rest of the solution is built.
IBM recommends a hardware configuration for a standard MDM CE production server, but that is just a pointer in the right direction, and MDM CE infrastructure architects should take it with a pinch of salt.
Some common areas to investigate when tackling performance bottlenecks are:
Ensuring enough physical memory (RAM) is available so that little or no swapping and paging occurs.
Latency and bandwidth between the application server boxes and the database server. This gains prominence when the data centers hosting them are far apart; hosting the primary DB and app servers in the same data center helps.
Running MDM CE on a dedicated set of boxes helps greatly: all the hardware resources are available to the application, and isolating performance issues becomes a relatively simple process.
Keeping an eye on disk reads, writes, and queue lengths; any of these rising to dangerous levels is not a good sign.
Clustering and Load Balancing
Clustering and load balancing are two prevalent techniques applications use to provide high availability and scalability.
Horizontal clustering – add more firepower to the MDM CE application by adding more application servers.
Vertical clustering – add more MDM CE services per app server box by taking advantage of MDM CE configuration, for example additional Scheduler and AppServer services as necessary.
Adding a load balancer (a software or hardware IP sprayer, or IBM HTTP Server) will greatly improve business users' experience with the MDM CE GUI application.
Go for a High-Performance Network File System
Clients typically go with an NFS filesystem for MDM CE clustered environments because it is free. For a highly concurrent MDM CE environment, opt for a commercial-grade, purpose-built, high-performance network file system such as IBM Spectrum Scale.
Database Optimization
The performance and reliability of MDM CE depend heavily on a well-managed database. Databases are highly configurable and can be monitored to optimize performance by proactively resolving bottlenecks.
The following are a few ways to tune database performance.
Optimize database lock waits, buffer pool sizes, table space mappings, and memory parameters to meet the system's performance requirements.
Follow the recommended configuration of a production-class DB server for the MDM CE application.
Keep the DB server and client at the latest compatible versions to take advantage of bug fixes and optimizations.
Ensure database statistics are up to date. Statistics can be collected manually by running the MDM CE shell script located at $TOP/src/db/schema/util/analyze_schema.sh.
Check memory allocation to make sure there are no unnecessary disk reads.
Defragment on an as-needed basis.
Check long-running queries, optimize query execution plans, and index candidate columns.
Execute $TOP/bin/indexRegenerator.sh whenever the indexed attributes in the MDM CE data model are modified.
MDM CE Application Optimization
Performance of the MDM CE application can be improved across various components, such as the data model and server configuration. The following are best practices to follow on the application side.
Data Model and Initial Load
Carefully choose the number of Specs; discard attributes that will not be mastered or governed in MDM CE.
Similarly, a larger number of views, attribute collections, items, and attributes slows user interface performance; tabbed views come in handy to tackle this.
Try to off-load cleansing and standardization activities outside the MDM solution.
A workflow with many steps can cause multiple problems, ranging from an unmanageable user interface to very slow operations for managing and maintaining the workflow, so it should be designed carefully.
MDM CE Services configuration
The MDM CE application comprises the following services, which are highly configurable for optimal performance: Admin, App Server, Event Processor, Queue Manager, Workflow Engine, and Scheduler.
All the above services can be fine-tuned through the following configuration files found within the application.
$TOP/bin/conf/ini – Allocate sufficient memory to the MDM CE Services here
$TOP/etc/default/common.properties – Configure connection pool size and polling interval for individual services here
Docstore Maintenance
The Document Store holds unstructured data in MDM CE, such as logs, feed files, and reports. Over time, Document Store usage grows exponentially, and so do the obsolete files. The document store maintenance reports should be used to check document store size and purge documents that no longer hold significance.
Use the IBM® MDMPIM DocStore Volume Report and IBM MDMPIM DocStore Maintenance Report jobs to analyze the volume of the DocStore and to clean up documents that are past the data retention period configured in the IBM_MDMPIM_DocStore_Maintenance_Lookup lookup table.
Configure the IBM_MDMPIM_DocStore_Maintenance_Lookup lookup table to set the data retention period for individual directories and the action to perform once that period has elapsed, such as Archive or Purge.
Cleaning up Old Versions
MDM CE does versioning in two ways.
Implicit versioning
This occurs when the current version of an object is modified during the export or import process.
Explicit versioning
This kind of versioning occurs when you manually request a backup.
Older versions of items, performance profiles, and job history need to be cleaned up periodically to reduce load on the DB server and, in turn, improve application performance.
Use the IBM MDMPIM Estimate Old Versions Report and IBM MDMPIM Delete Old Versions Report jobs on a schedule to estimate and then clear out old entries. Configure the IBM MDMPIM Data Maintenance Lookup lookup table to hold the appropriate data retention periods for old versions, performance profiles, and job history.
Best Practices in Application Development
MDM CE offers a couple of programming paradigms for application developers customizing the out-of-the-box (OOTB) solution.
Scripting API – a proprietary scripting language that converts scripts into Java classes at runtime and runs them in the JVM. Follow the documented best practices for better performance.
Java API – always prefer the Java API over the Scripting API for better performance. Again, ensure the documented best practices are diligently followed.
If the Java API is used for MDM CE application development or customization, then:
Use code analysis tools like PMD, FindBugs, and SonarQube as periodic checkpoints so that only optimized code is shipped at all times.
Use profiling tools like JProfiler, XRebel, YourKit, or VisualVM to continuously monitor thread pool usage, memory pool statistics, garbage collection frequency, and so on. Using these tools during resource-intensive MDM CE activities, such as heavyweight import or export jobs, not only sheds light on the inner workings of the JVM but also offers cues about candidates for optimization.
Cache Management
Keeping frequently accessed objects in cache is a primary technique for improving performance. The cache hit percentage needs to be very high for the application to function smoothly.
Check the cache hit percentage for various objects under the GUI menu System Administrator -> Performance Info -> Caches. The $TOP/etc/default/mdm-ehcache-config.xml and $TOP/etc/default/mdm-cache-config.properties files can be configured to hold a larger number of entries in cache for better performance.
Performance Profiling
Successful performance testing will surface most performance issues, whether they relate to the database, network, software, or hardware. Establish a baseline, identify targets, and analyze use cases to make sure that the application's performance holds up over time.
Identify areas of the solution that extend beyond the normal range; a few examples are a large number of items, lots of searchable attributes, and a large number of lookup tables.
Frameworks such as JUnit and JMeter should be used in an MDM CE engagement where the Java API is the programming language of choice.
About the authors
Sruthi is an MDM Consultant at InfoTrellis and has worked on multiple IBM MDM CE engagements. She has over 2 years of experience with technologies such as IBM Master Data Management Collaborative Edition and BPM.
Selvagowtham is an MDM Consultant at InfoTrellis and has been plying his trade in Master Data Management for over 2 years. He is proficient in the IBM Master Data Management Collaborative Edition and Advanced Edition products.
1 note · View note
Link
Zeta ERP DocStore is an efficient, easily accessible document archiving system. Instant retrieval of any document is possible in just a few clicks
0 notes
exmcloud · 3 months ago
Photo
Tumblr media
Stay tuned for more quick tips to streamline your workflow and boost productivity with DocStore!
0 notes
exmcloud · 3 months ago
Photo
Tumblr media
Fun Fact Friday! Streamline your workflow, save on costs, and keep your documents secure—all with DocStore. fact source: https://www.business.com/articles/7-statistics-that-will-make-you-rethink-your-document-management-strategy/
1 note · View note
exmcloud · 3 months ago
Video
tumblr
DocStore isn't just a name—it’s your all-in-one solution for seamless document management. Simplify, secure, and integrate your documents with ease.
1 note · View note
exmcloud · 3 months ago
Photo
Tumblr media
Switch to DocStore for effortless, secure, and centralized document management—right from Salesforce.
1 note · View note