Defined.ai

About
Defined.ai is a company that provides high-quality, ethically sourced AI training data. They offer a large marketplace with diverse datasets for various applications, including spontaneous speech, scripted monologues, interactive voice response (IVR), and more. They also provide custom data services, quality control, and support. The company is focused on ethical AI development and maintains transparency in their data collection and handling processes.
Features
• transcription
• expert support
• data collection
• data annotation
• ethical data sourcing
• high-quality data
• custom data services
• large selection of datasets
FAQs
How and from where were the participants in these datasets recruited?
Contributors are recruited using various methods, including organic and paid acquisition strategies, across self-owned channels, third-party platforms, and partnerships. Targeting is based on demographics, skills, experience, language, device, interests, and real-time context.
How do we inform the dataset participants about how the data collected will be used?
Contributors consent to our Terms of Use, Privacy Policy, and Cookies Policy before using the platform. The Privacy Policy details information collection and usage. Contributors can delete their accounts at any time, leading to anonymization of their data. We are GDPR compliant and ISO 27001 certified.
How do you determine pay rates for your participants in various locales?
Our pay policy ensures at least minimum wage, and in some cases, living wages. Rates depend on factors such as skill set and ability to attract contributors. Higher skills (e.g., medical collections) necessitate higher pay.
What are the terms of the Data License?
Defined.ai datasets are covered by a standard license agreement (link provided in the FAQ). The license is perpetual and allows commercialization of models built using the data.
What is Spontaneous IVR data and how is it gathered?
Spontaneous IVR data is gathered by having a human respond to an IVR system while following real-life scenarios. The speaker repeats their query in different ways, and the speech is transcribed. Recording is done via telephony (8 kHz, 16-bit, per channel).
What is Spontaneous Dialog data and how is it gathered?
Spontaneous Dialog data involves two crowd members following pre-studied scenarios and recording a conversation. One plays the agent and the other plays a customer, with spontaneous content. Recording is done via telephony (8 kHz, 16-bit, per channel), and the speech is transcribed.
What is Scripted Monologue data and how is it gathered?
Scripted Monologue data involves speakers reading aloud from a given prompt. Clients receive the audio, the prompt, and speaker information. Audio is recorded on-device (typically 16 kHz, 16-bit). Device information is also provided.
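The recording formats mentioned in the answers above translate directly into storage requirements. The following is a minimal sketch, assuming raw uncompressed PCM with no container header; the delivery format is not stated here, so the function name `pcm_bytes` and the resulting figures are illustrative only.

```python
def pcm_bytes(duration_s: float, sample_rate_hz: int,
              bit_depth: int = 16, channels: int = 1) -> int:
    """Size in bytes of raw, uncompressed PCM audio (assumption: no header, no compression)."""
    bytes_per_sample = bit_depth // 8
    return int(duration_s * sample_rate_hz * bytes_per_sample * channels)


HOUR_S = 3600

# Telephony recordings: 8 kHz, 16-bit, counted per channel.
telephony_hour = pcm_bytes(HOUR_S, 8_000)
# On-device recordings: typically 16 kHz, 16-bit.
on_device_hour = pcm_bytes(HOUR_S, 16_000)

print(telephony_hour / 1e6)  # 57.6  (MB per hour, per channel)
print(on_device_hour / 1e6)  # 115.2 (MB per hour, per channel)
```

Under these assumptions, one hour of telephony audio takes roughly half the space of one hour of on-device audio, because the sample rate is halved while the bit depth stays the same.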
If I buy 200h of data, does it mean I will get 200h of pure speech?
Not necessarily. Purchased hours are measured as total audio duration, not as pure speech. Scripted speech includes silence before and after the reading. Dialogue speech generally has little silence apart from natural breaks. For IVR, human speech segments make up about 50% of the audio duration.
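As a rough worked example of the answer above, the speech content of a purchase can be estimated by multiplying the purchased audio duration by a speech fraction. This is only a sketch: the function name `estimated_speech_hours` is hypothetical, and only the IVR fraction (about 50%) comes from the FAQ. No fraction is given for scripted or dialogue data, so it would have to be measured, for example on a free sample.

```python
# Speech fractions by dataset type. Only the IVR value (about 50%) is stated
# in the FAQ; other types are deliberately left out rather than guessed.
SPEECH_FRACTION = {"ivr": 0.5}


def estimated_speech_hours(purchased_hours: float, speech_fraction: float) -> float:
    """Estimate hours of actual speech within a purchased audio duration."""
    if not 0.0 <= speech_fraction <= 1.0:
        raise ValueError("speech_fraction must be between 0 and 1")
    return purchased_hours * speech_fraction


# 200 h of IVR audio contains roughly 100 h of human speech.
print(estimated_speech_hours(200, SPEECH_FRACTION["ivr"]))  # 100.0
```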
Can I get a sample of a dataset?
Free samples are available for download on the website.
Can you package subsets of data for me according to specific requirements of age, gender and accent?
Yes, custom datasets can be packaged based on specific requirements such as age, gender, and accent.
I need data that is not listed on the marketplace. Can you help me with my request?
We can help either by creating a custom collection or by letting you know about planned datasets that may meet your requirements.
What are the payment options?
USD via ACH bank transfer. Purchase orders, SOWs, and other documentation are available upon request.
When will my purchased assets be delivered?
Datasets are delivered after payment is received. ACH transfers require cleared funds (2-3 business days). Custom orders may take longer.
Are there specific terms for Academia?
Yes, datasets are offered with significant discounts or even for free to Academia after a due diligence process.
Do you offer discounts?
Yes, discounts are available based on data volume. Contact us for a quotation.
Job Opportunities
AI/ML Sales Executive (US)
Defined.ai offers a large marketplace for high-quality, ethically sourced AI training data, providing diverse datasets and custom data services.
Benefits:
Flexible working schedule and hybrid model
Excellent career development opportunities
Culture of feedback and continuous improvement
International and diverse team
Continuous training opportunities
Education Requirements:
Bachelor's degree or equivalent
Experience Requirements:
6+ years of proven experience working as a Sales Executive selling Professional Services / Data / Customized Projects / Consultative Sales into Enterprise accounts (B2B)
Other Requirements:
Proficient with Salesforce / CRM and MS Office
Ability to communicate, present and influence all levels of the organization, including executives
Strong ability to directly handle and close complex deals above $1M
Knowledge of AI/ML
Technical Sales experience is a plus
Responsibilities:
Hunting for new logos in the assigned Enterprise verticals
Expanding the company’s footprint in existing enterprise or strategic accounts
Managing enterprise and/or strategic customers with significant deal sizes of $500k-$5M
Creating organic revenue streams working with the solutions and customer success teams within assigned territories/regions
Supporting and collaborating with internal partners to build successful proofs of concept, use cases, RFPs, etc.
B2B Technical Writer
Benefits:
Flexible working schedule and hybrid model
Excellent career development opportunities
Culture of feedback and continuous improvement
International and diverse team
Continuous training opportunities
Experience Requirements:
5+ years of B2B technical writing experience
Other Requirements:
Strong understanding of AI concepts
Exceptional writing skills
Ability to work effectively with cross-functional teams
Knowledge of SEO best practices
Responsibilities:
Write AI-focused B2B content
Collaborate with product and engineering teams
Support marketing team by developing content
Ensure content is relevant and localized
Implement SEO best practices
Backend Engineer
Benefits:
Flexible working schedule and hybrid model
Excellent career development opportunities
Culture of feedback and continuous improvement
International and diverse team
Continuous training opportunities
Education Requirements:
BSc or MSc in Computer Science or similar background
Experience Requirements:
Mid- to senior-level experience with .NET C# and software quality best practices
Other Requirements:
Experience working with Agile software development methodologies
Experience with Azure services such as DevOps, Kubernetes, and Blob Storage
Deep understanding of a fully automated software development lifecycle via CI/CD pipelines
Comfortable with applying software design and architectural patterns/principles
Accustomed to working with microservices in .NET C#, MS SQL Server, and RabbitMQ
Knowledge of RESTful APIs
Familiarity with shell scripting
Proficient in both written and spoken English
Responsibilities:
Work on the back-end side of our platform by developing tools to automate workloads for data collection and processing of AI training datasets
Develop and evolve a microservice-based, event-driven architecture built mainly on .NET C#, SQL Server, and RabbitMQ
Own the entire lifecycle (from conception to release and maintenance) of the services and applications your team owns
Work in a multidisciplinary (QA, Back- and Front-end Engineers, Product Managers, etc.) and multicultural Agile team
Collaborate with the Product, Architecture, Infrastructure, and DevOps teams
Alternatives
Crustdata
Crustdata is a real-time company and people data provider that fuels commercial, internal, and sales platforms, specifically designed to power AI intelligence layers.
SurfingTech
SurfingTech is an AI data provider with deep industry know-how, offering tailored AI datasets for multi-ethnicities, multimodalities, and diverse expressions.
Mage Data
Mage Data is a comprehensive platform for secure data provisioning and Test Data Management 2.0, focusing on data privacy, security, and compliance for enterprises.
Aleno
Aleno is a real-time on-chain market data provider for any chain, protocol, or token, offering unmatched accuracy and reliability for market insights.
Bytient
Bytient is a B2B data provider leveraging forensic intelligence, predictive analytics, and machine learning to offer deep data on any organization.