Cleora AI

Freemium

About

Cleora AI is a general-purpose open-source model for efficient, scalable learning of stable and inductive entity embeddings for heterogeneous relational data. It can embed heterogeneous undirected graphs, heterogeneous undirected hypergraphs, text and other categorical array data, and any combination of the above.

Key competitive advantages of Cleora:

more than 197x faster than DeepWalk

~4x-8x faster than PyTorch-BigGraph (depends on use case)

quality of results outperforming or competitive with other embedding frameworks like PyTorch-BigGraph, GOSH, DeepWalk, LINE

can embed extremely large graphs & hypergraphs on a single machine

Platform
Web
Keywords
ai, embeddings, ml, graphs
Task
data embedding

Features

efficient

star decomposition of hyper-edges

creation of pairwise graphs for all pairs of entity types

embedding of each graph

extreme parallelism and performance

dim-wise independence

cross-dataset compositionality

stable

updatable

inductive

FAQs

What should I embed?

Any entities that interact with each other, co-occur or can be said to be present together in a given context.

How should I construct the input?

What works best is grouping entities that co-occur in a similar context and feeding them in as whitespace-separated lines. Using the `complex::reflexive` modifier is a good idea.
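
The grouping step described above can be sketched as follows. This is a minimal illustration, not part of Cleora itself: the `(basket, product)` record layout and the names `purchases` and `to_cleora_lines` are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical purchase log: (basket_id, product_id) pairs.
purchases = [
    ("basket_1", "apple"),
    ("basket_1", "bread"),
    ("basket_2", "bread"),
    ("basket_2", "milk"),
    ("basket_2", "apple"),
]


def to_cleora_lines(records):
    """Group entities by shared context; emit one whitespace-separated line per group."""
    groups = defaultdict(list)
    for context, entity in records:
        groups[context].append(entity)
    # Each line lists all entities seen together in one context (one hyperedge).
    return [" ".join(entities) for entities in groups.values()]


lines = to_cleora_lines(purchases)
for line in lines:
    print(line)
```

Entity identifiers must not contain whitespace in this layout, since whitespace is the separator.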

Can I embed users and products simultaneously, to compare them with cosine similarity?

No, this is a methodologically wrong approach, stemming from outdated matrix factorization approaches. What you should do is come up with good product embeddings first, then create user embeddings from them.
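
The answer above does not say how user embeddings are derived from product embeddings. One common choice, assumed here purely for illustration, is to average the embeddings of the products a user interacted with and L2-normalize the result. The vectors and the names `product_embeddings` and `user_embedding` are hypothetical.

```python
import math

# Hypothetical product embeddings (in practice, the output of an embedding run).
product_embeddings = {
    "apple": [1.0, 0.0],
    "bread": [0.0, 1.0],
    "milk": [1.0, 1.0],
}


def l2_normalize(vector):
    """Scale a vector to unit length; leave an all-zero vector unchanged."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def user_embedding(product_ids, embeddings):
    """Assumed aggregation: mean of the user's product vectors, then L2-normalized."""
    vectors = [embeddings[p] for p in product_ids]
    dim = len(vectors[0])
    mean = [sum(v[d] for v in vectors) / len(vectors) for d in range(dim)]
    return l2_normalize(mean)


alice = user_embedding(["apple", "bread"], product_embeddings)
print(alice)
```

Because the user vector lives in the product embedding space, it can be compared with product vectors by cosine similarity.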

What embedding dimensionality should I use?

The more, the better, but we typically work with _1024_ to _4096_ dimensions. Memory is cheap and machines are powerful, so don't skimp on embedding size.

How many iterations of Markov propagation should I use?

It depends on what you want to achieve. A low number of iterations (3) tends to approximate the co-occurrence matrix, while a high number (7+) tends to capture contextual similarity.
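
To make the role of the iteration count concrete, here is a small sketch of repeated propagation on a toy graph. It assumes a dense row-stochastic transition matrix stored as nested lists and L2 row normalization after every step; the graph, the initial vectors, and the function names are all made up for the example and do not reproduce Cleora's implementation.

```python
import math


def matmul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def l2_normalize_rows(matrix):
    """Scale every row to unit length; all-zero rows are left unchanged."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm for x in row] if norm > 0 else list(row))
    return out


def propagate(transition, embeddings, iterations):
    """Apply `iterations` rounds of propagation, normalizing after each round."""
    for _ in range(iterations):
        embeddings = l2_normalize_rows(matmul(transition, embeddings))
    return embeddings


# Toy graph with edges a-b, a-c, b-c, c-d; each row sums to 1.
transition = [
    [0.0, 0.5, 0.5, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [1 / 3, 1 / 3, 0.0, 1 / 3],
    [0.0, 0.0, 1.0, 0.0],
]
initial = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]

low = propagate(transition, initial, iterations=3)
high = propagate(transition, initial, iterations=7)
print(low)
print(high)
```

Comparing `low` and `high` on real data (for example through nearest-neighbor lists) is a practical way to pick the iteration count for a given task.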

How do I incorporate external information, e.g. entity metadata, images, texts into the embeddings?

Just initialize the embedding matrix with your own vectors coming from a ViT, sentence-transformers, or a random projection of your numeric features.
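
The random-projection option can be sketched with the standard library alone. This is an illustrative assumption of how such an initialization might look: the feature columns, the Gaussian projection, and the names `random_projection_matrix` and `project` are hypothetical, and a target dimensionality of 8 is used only to keep the example small.

```python
import random


def random_projection_matrix(n_features, dim, seed=0):
    """Build a fixed Gaussian projection matrix of shape (n_features, dim)."""
    rng = random.Random(seed)  # fixed seed keeps the projection reproducible
    scale = 1.0 / dim ** 0.5
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)] for _ in range(n_features)]


def project(features, projection):
    """Map one feature vector into the embedding space."""
    dim = len(projection[0])
    return [sum(f * projection[i][d] for i, f in enumerate(features)) for d in range(dim)]


# Hypothetical numeric metadata per entity: (price, weight_kg, rating).
entity_features = {
    "apple": [0.5, 0.2, 4.5],
    "bread": [2.0, 0.8, 4.1],
    "milk": [1.2, 1.0, 3.9],
}

projection = random_projection_matrix(n_features=3, dim=8, seed=42)
initial_embeddings = {
    entity: project(features, projection)
    for entity, features in entity_features.items()
}
print(len(initial_embeddings["apple"]))
```

The same projection matrix has to be reused for every entity, so that entities with similar features start from similar vectors.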

My embeddings don't fit in memory, what do I do?

Cleora operates on dimensions independently. Initialize your embeddings with a smaller number of dimensions, run Cleora, persist to disk, then repeat.
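
The chunked workflow above relies on the propagation step treating every embedding column independently. The sketch below checks that property on a toy example: propagating slices of dimensions separately and concatenating them gives the same result as propagating the full matrix. The matrix, the data, and the function names are assumptions, and row normalization is deliberately left out because it would have to be handled across chunks.

```python
def propagate_once(transition, embeddings):
    """One propagation step: each output column depends only on the same input column."""
    n_cols = len(embeddings[0])
    return [
        [sum(t * embeddings[j][d] for j, t in enumerate(row)) for d in range(n_cols)]
        for row in transition
    ]


def propagate_in_chunks(transition, embeddings, chunk_size, iterations):
    """Propagate a few dimensions at a time, then stitch the chunks back together."""
    total_dims = len(embeddings[0])
    result = [[] for _ in embeddings]
    for start in range(0, total_dims, chunk_size):
        chunk = [row[start:start + chunk_size] for row in embeddings]
        for _ in range(iterations):
            chunk = propagate_once(transition, chunk)
        # In a real run, the finished chunk would be persisted to disk here.
        for out_row, chunk_row in zip(result, chunk):
            out_row.extend(chunk_row)
    return result


transition = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]
embeddings = [
    [1.0, 0.0, 2.0, 0.5],
    [0.0, 1.0, 1.0, 0.5],
    [1.0, 1.0, 0.0, 2.0],
]

full = embeddings
for _ in range(3):
    full = propagate_once(transition, full)

chunked = propagate_in_chunks(transition, embeddings, chunk_size=2, iterations=3)
matches = all(
    abs(x - y) < 1e-12
    for full_row, chunk_row in zip(full, chunked)
    for x, y in zip(full_row, chunk_row)
)
print(matches)
```

Only one chunk of columns needs to be in memory at a time, which is what makes the persist-and-repeat approach workable.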

Is there a minimum number of entity occurrences?

No, an entity `A` co-occurring just once with some other entity `B` will get a proper embedding, i.e. `B` will be the most similar to `A`.

Are there any edge cases where Cleora can fail?

Cleora works best for relatively sparse hypergraphs. If all your hyperedges contain some very common entity `X`, e.g. a _shopping bag_, then that entity will degrade the quality of the embeddings.

How can Cleora be so fast and accurate at the same time?

Not using negative sampling is a great boon. By constructing the (sparse) Markov transition matrix, Cleora explicitly performs all possible random walks in a hypergraph in one big step (a single matrix multiplication).
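
As a rough illustration of the idea, the sketch below builds a sparse, row-normalized transition matrix from co-occurrence lines and applies one propagation step. It connects every pair of entities within a line, a simplification chosen for the example; Cleora's actual hyperedge expansion and sparse storage are not reproduced, and all names are hypothetical.

```python
from collections import defaultdict
from itertools import permutations


def build_transition(lines):
    """Count pairwise co-occurrences per line, then normalize each row to sum to 1."""
    counts = defaultdict(lambda: defaultdict(float))
    for line in lines:
        entities = line.split()
        for a, b in permutations(entities, 2):
            counts[a][b] += 1.0
    transition = {}
    for a, neighbors in counts.items():
        total = sum(neighbors.values())
        transition[a] = {b: c / total for b, c in neighbors.items()}
    return transition


def step(transition, embeddings):
    """One multiplication: each vector becomes the weighted mean of its neighbors' vectors."""
    dim = len(next(iter(embeddings.values())))
    return {
        a: [sum(p * embeddings[b][d] for b, p in neighbors.items()) for d in range(dim)]
        for a, neighbors in transition.items()
    }


lines = ["apple bread", "bread milk apple"]
transition = build_transition(lines)

embeddings = {"apple": [1.0, 0.0], "bread": [0.0, 1.0], "milk": [1.0, 1.0]}
propagated = step(transition, embeddings)
print(transition["apple"])
print(propagated["apple"])
```

Each row of `transition` holds the probabilities of moving from one entity to its neighbors, so a single step computes the expected outcome of all one-step walks without sampling any of them.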

Pricing Plans

Free
Free Plan

Unlimited public/private repositories

Dependabot security and version updates

2,000 CI/CD minutes/month

500MB of Packages storage

Issues & Projects

Community support

GitHub Copilot Access

GitHub Codespaces Access

Team
USD 4.00 per user/month

Everything included in Free, plus...

Access to GitHub Codespaces

Protected branches

Multiple reviewers in pull requests

Draft pull requests

Code owners

Required reviewers

Pages and Wikis

Environment deployment branches and secrets

3,000 CI/CD minutes/month (free for public repositories): Use execution minutes with GitHub Actions to automate your software development workflows. Write tasks and combine them to build, test, and deploy any code project on GitHub.

2GB of Packages storage (free for public repositories): Host your own software packages or use them as dependencies in other projects. Both private and public hosting available.

Web-based support: GitHub Support can help you troubleshoot issues you run into while using GitHub.

GitHub Secret Protection: Ensure your secrets stay secure. Mitigate risk associated with exposed secrets in your repositories, while preventing new leaks before they happen with push protection.

GitHub Code Security: Find and fix vulnerabilities in your code before they reach production. Prioritize your Dependabot alerts with automated triage rules.

Enterprise
USD 21.00 per user/month

Everything included in Team, plus...

Data residency

Enterprise Managed Users

User provisioning through SCIM

Enterprise Account to centrally manage multiple organizations

Environment protection rules

Repository rules

Audit Log API

SOC1, SOC2, type 2 reports annually

FedRAMP Tailored Authority to Operate (ATO): Government users can host projects on GitHub Enterprise Cloud with the confidence that our platform meets the low impact software-as-a-service (SaaS) baseline of security standards set by our U.S. federal government partners.

SAML single sign-on: Use an identity provider to manage the identities of GitHub users and applications.

Advanced auditing: Quickly review the actions performed by members of your organization. Keep copies of audit log data to ensure secure IP and maintain compliance for your organization.

GitHub Connect: Share features and workflows between your GitHub Enterprise Server instance and GitHub Enterprise Cloud.

50,000 CI/CD minutes/month (free for public repositories): Use execution minutes with GitHub Actions to automate your software development workflows. Write tasks and combine them to build, test, and deploy any code project on GitHub.

50GB of Packages storage (free for public repositories): Host your own software packages or use them as dependencies in other projects. Both private and public hosting available.

Premium support: With Premium, get a 30-minute SLA on Urgent tickets and 24/7 web and phone support via callback request. With Premium Plus, get everything in Premium, assigned Customer Reliability Engineer and more.

