Indexify

About
Indexify is an open-source data framework designed for effortless ingestion and extraction of unstructured data at any scale for LLMs. It features a real-time extraction engine, pre-built extractors for various data types (documents, presentations, videos, audio), and supports custom extractor creation. Data retrieval is facilitated by semantic search and SQL querying. Indexify scales from local runtimes to large-scale Kubernetes deployments across multiple clouds. It also provides end-to-end observability and monitoring of ingestion, extraction, and retrieval processes.
Features
• semantic search
• multi-modal support
• runs on laptops and across large-scale deployments (Kubernetes, VMs, bare metal)
• SQL querying
• custom extractor creation using the SDK
• reliable extraction for unstructured data (documents, presentations, videos, audio)
• pre-built extraction adapters
• real-time extraction engine
Job Opportunities
Founding Applied AI Scientist
Indexify is an open-source, real-time data extraction framework for LLMs, supporting various data types and scalable deployments.
Benefits:
401(k) plans
Comprehensive Healthcare and Dental Benefits
Education Requirements:
Ph.D. or Bachelor's degree in a quantitative field such as Computer Science or Mathematics, or equivalent industry experience
Experience Requirements:
4+ years of experience working with AI/ML models, specifically in the fields of document understanding, computer vision, and multi-modal learning
Proven expertise in training and evaluating models for complex document extraction
Deep NLP Expertise
OCR Integration
Model Pretraining and Fine-tuning
Other Requirements:
Solid programming skills in Python and proficiency in at least one deep learning framework (e.g., TensorFlow, PyTorch)
Layout Analysis
Benchmarking and Evaluation
Vision-Language Models
Responsibilities:
Design, train, and evaluate document understanding models for extracting complex data
Develop and optimize multi-modal visual Q&A models
Collaborate with the team to integrate AI-driven features into Tensorlake’s platform
Work closely with users and customers to understand their needs
Founding Backend Engineer
Benefits:
401(k) plans
Comprehensive Healthcare and Dental Benefits
Education Requirements:
Ph.D. or Bachelor's degree in Math, Computer Science, or other quantitative fields, OR equivalent experience
Experience Requirements:
7+ years of relevant work experience
Experience in building large-scale distributed systems
Other Requirements:
Knowledge of systems programming languages such as Rust, Go, C++, or C
Designing observable systems that operate at internet scale
Deep knowledge of operating and using cluster schedulers
Responsibilities:
Design and implement a distributed control plane for operating Indexify on public clouds
Design and implement workflows for cluster operations and bootstrapping in VPCs
Focus on the long-term operability of the system and services
Work closely with the Founder on the company's technical direction and platform
Work closely with our users to learn the impact of our product and improve their experience
Founding Product Engineer
Benefits:
Healthcare, Dental and Vision Insurance
401(k) plans
5 weeks of PTO
Experience Requirements:
At least 7 years of front-end or full-stack development
Familiarity with technologies such as Python, React, TypeScript, FastAPI, or SQLAlchemy
Other Requirements:
Motivated people who are excited to build tools to power the next generation of cloud applications
Passionate about working adjacent to users and the product
Responsibilities:
Develop delightful UIs or high-quality backend business logic that empowers software developers and simplifies programming
Work with a team of leading distributed systems and machine learning experts
Communicate your work to a broader audience through talks, tutorials, and blog posts
Help us build and shape a world-class company
Alternatives
LiftData
LiftData provides real-time AI-powered data extraction from various content sources using a decentralized, scalable platform.
Gilio
Gilio processes documents with AI, extracting and transforming information for automation. It integrates with various systems via API and offers features such as data validation, document digitization, and workflow automation.
PDFMerse
PDFMerse is an AI-powered tool that transforms PDFs into structured data, offering automated extraction, enhanced accuracy, and versatile output formats.
Map Lead Scraper
Map Lead Scraper is a Google Maps scraping tool that extracts local business data and contacts, saving hours of manual searches for lead generation.
InstantAPI.ai
InstantAPI.ai is an AI web scraping API that extracts clean data from any webpage without requiring selectors, CAPTCHA handling, or extensive maintenance.
FormToExcel
FormToExcel is an AI-powered tool that converts various forms, tables, receipts, and invoices into Excel spreadsheets with high accuracy, simplifying data entry.
GLIB.ai
GLIB.ai uses AI to automate document processing, extract data, and provide real-time insights for banking, insurance, and supply chain finance.
Skrapy
Skrapy is an AI tool for agent-driven data extraction, helping you unlock the web and collect relevant data efficiently.
Cardamons Digitize
AI-powered intelligent document processing (IDP) platform for automated data extraction with 98% accuracy.
JSON Scout
JSON Scout is an API tool leveraging LLMs to convert unstructured content (text, audio) into structured JSON data with human-like precision, eliminating the need for complex REGEX patterns.
Evolution AI
Evolution AI provides AI data extraction from financial documents, including bank statements and invoices, offering both self-service and managed service options.
Infrrd IDP
Infrrd IDP: AI-powered Intelligent Document Processing for automated data extraction, helping businesses scale with accuracy and efficiency.
Subsystem AI
Subsystem AI helps you extract and organize data from documents using AI, offering features like automated table extraction and API integration. Coming soon: bulk processing and document comparison.
Rossum
Rossum is an AI-powered platform for automating transactional document processing, from data capture to approvals, using AI to extract data, validate information, and trigger automated workflows.
Palladian
Palladian is an open-source Java library offering robust machine learning, web retrieval, and entity extraction capabilities for developers and researchers.
Docugami
Docugami transforms business documents with AI, improving productivity, compliance, and insight. It delivers immediate impact by connecting to familiar tools without rigid templates.
Dataku
Advanced data extraction and analysis tool utilizing AI for accurate insights from documents and texts.
Kadoa
Kadoa is an AI-powered platform that automatically extracts unstructured web data at scale without code, providing instant insights and eliminating engineering bottlenecks.
Intics
Intics is an all-inclusive solution empowering you to master document processing and data management with intelligent data recognition, extraction, and validation.