ArcadeDB Embedded Python Bindings


Meet ArcadeDB’s embedded Python bindings

We’ve released ArcadeDB Embedded Python Bindings, a lightweight way to use ArcadeDB directly from Python without a driver hop. This is the foundation for much of HumemAI’s R&D work and future products.

Embedded (JVM) access for low‑latency local workloads.
DSL-first workflows from Python using SQL and OpenCypher.
Self-contained packaging with bundled JRE and JARs.

Why ArcadeDB

ArcadeDB is a high‑performance, multi‑model database built for extreme efficiency. It supports documents, graphs, key/value, full‑text, and vector embeddings in one engine, with ACID transactions and multiple query languages (SQL, OpenCypher, MongoDB queries). It’s Java‑based, which is powerful but not always easy to adopt in the AI world, where workflows are dominated by scripting languages like Python. That gap is one of the reasons we built the embedded bindings.

Repo: ArcadeDB on GitHub
Docs: ArcadeDB Documentation

For us, ArcadeDB’s multi‑model design is the right substrate for memory systems that need structured relationships and fast retrieval at scale.

Why Embedded Python Bindings

HumemAI favors local‑first and low‑latency data access. A thin embedded binding removes the networked driver hop and lets Python talk to the JVM engine directly. This improves latency, reduces operational overhead, and makes it easier to ship reliable, testable agents.

It also means we can keep memory-intensive workloads close to the data, while still using Python to drive lifecycle, transactions, parameters, and experiments.

What’s Included

The repository contains the original ArcadeDB codebase plus the Python bindings under bindings/python. The bindings expose a Pythonic API over core ArcadeDB functionality, though the current examples lean DSL-first rather than leading with Python object wrappers:

SQL DDL/DML from Python for schema, ingest, and analytics
OpenCypher for graph pattern queries, with SQL MATCH also available
Vector indexing and search through SQL features like LSM_VECTOR and vectorNeighbors(...)
Transactions, batch workloads, and importer/exporter utilities
Embedded runtime with optional server-mode access when needed

In other words, Python is increasingly the orchestration layer, while SQL and OpenCypher carry most of the data work.
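As a rough sketch of that split, the helper below groups rows into fixed-size batches and runs each batch inside one transaction, leaving the actual work to a SQL statement. It relies only on the `db.transaction()` and `db.command(...)` calls that appear in the examples in this post; `chunked`, `insert_in_batches`, and the default batch size are hypothetical names and values chosen for illustration, not part of the bindings.

```python
from itertools import islice
from typing import Any, Iterable, Iterator, List, Sequence


def chunked(items: Iterable[Any], size: int) -> Iterator[List[Any]]:
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be >= 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def insert_in_batches(
    db: Any,
    sql: str,
    rows: Iterable[Sequence[Any]],
    batch_size: int = 1000,  # illustrative default, not a recommendation
) -> int:
    """Run `sql` once per row, using one transaction per batch.

    Assumption: `db` exposes `transaction()` as a context manager and
    `command(language, statement, *params)`, as in this post's examples.
    Returns the number of rows sent.
    """
    sent = 0
    for batch in chunked(rows, batch_size):
        with db.transaction():
            for params in batch:
                db.command("sql", sql, *params)
                sent += 1
    return sent
```

With an open database, a call might look like `insert_in_batches(db, "INSERT INTO User SET name = ?, age = ?", rows)`, where `rows` is any iterable of `(name, age)` tuples. A suitable batch size depends on the workload and is worth measuring rather than guessing.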

A truly standalone wheel

The Python wheel is fully self‑contained. It ships with everything needed to run ArcadeDB embedded inside your Python process:

Lightweight Java 25 runtime built with jlink (no external Java install)
JPype bridge to call JVM APIs from Python
Only the required JARs, plus the Python bindings code
Prebuilt platform-specific wheels for Linux x86_64, Linux ARM64, macOS ARM64, and Windows x86_64

Despite bundling the JVM, we keep the package compact: the wheel is about 74 MB compressed today, with slight variation by platform and version.
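To check whether the current machine falls under one of the prebuilt wheel targets listed above, the standard `platform` module is enough. This is only a sketch: `PREBUILT_TARGETS`, `normalize`, and `has_prebuilt_wheel` are hypothetical names, and the architecture aliases (`AMD64`, `aarch64`, `arm64`) reflect what Python commonly reports on those systems rather than anything the package itself defines.

```python
import platform
from typing import Optional, Tuple

# Platforms named in this post, as (system, architecture) pairs.
PREBUILT_TARGETS = {
    ("linux", "x86_64"),
    ("linux", "arm64"),
    ("darwin", "arm64"),   # macOS
    ("windows", "x86_64"),
}


def normalize(system: str, machine: str) -> Tuple[str, str]:
    """Lower-case both values and fold common architecture aliases together."""
    system = system.lower()
    machine = machine.lower()
    if machine in ("x86_64", "amd64"):
        arch = "x86_64"
    elif machine in ("arm64", "aarch64"):
        arch = "arm64"
    else:
        arch = machine
    return system, arch


def has_prebuilt_wheel(system: Optional[str] = None, machine: Optional[str] = None) -> bool:
    """Return True if the (system, machine) pair matches a listed target.

    Defaults to the values reported for the running interpreter.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return normalize(system, machine) in PREBUILT_TARGETS
```

Calling `has_prebuilt_wheel()` with no arguments checks the running interpreter. The package index remains the authority on which wheels are actually published for a given release.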

Quick Start (Python)

Install the package. Each example below then creates its own local database:

uv pip install arcadedb-embedded

Tables (SQL), Graphs (OpenCypher), and Vectors

ArcadeDB gives us all three data models in one embedded engine. The current bindings examples lead with SQL and OpenCypher from Python, rather than a large Python-native object model.

Tables with SQL

We define schema, insert records, and query through SQL.

import arcadedb_embedded as arcadedb

with arcadedb.create_database("./mydb") as db:
    # Schema: a document type with two typed properties and a non-unique index on age.
    db.command("sql", "CREATE DOCUMENT TYPE User")
    db.command("sql", "CREATE PROPERTY User.name STRING")
    db.command("sql", "CREATE PROPERTY User.age INTEGER")
    db.command("sql", "CREATE INDEX ON User (age) NOTUNIQUE")

    # Writes run inside a transaction; values are bound as positional parameters.
    with db.transaction():
        db.command("sql", "INSERT INTO User SET name = ?, age = ?", "Ada", 31)
        db.command("sql", "INSERT INTO User SET name = ?, age = ?", "Grace", 28)

    # Parameterized read: with the two rows above, only Ada has age >= 30.
    for row in db.query("sql", "SELECT name FROM User WHERE age >= ? ORDER BY age DESC", 30):
        print(row.get("name"))
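If downstream Python code prefers plain dictionaries, the result rows can be copied out field by field. The sketch below relies only on the `row.get(name)` accessor used above; `rows_to_dicts` is a hypothetical helper rather than a function the bindings provide, and `FakeRow` exists purely so the snippet runs without a database.

```python
from typing import Any, Dict, Iterable, List, Sequence


def rows_to_dicts(rows: Iterable[Any], fields: Sequence[str]) -> List[Dict[str, Any]]:
    """Copy the named fields out of each row via row.get(name)."""
    return [{name: row.get(name) for name in fields} for row in rows]


class FakeRow:
    """Stand-in for a result row; it only mimics the .get(name) accessor."""

    def __init__(self, **values: Any) -> None:
        self._values = values

    def get(self, name: str) -> Any:
        return self._values.get(name)


# Sample rows standing in for a query result.
rows = [FakeRow(name="Ada", age=31), FakeRow(name="Grace", age=28)]
records = rows_to_dicts(rows, ["name", "age"])
print(records)
```

With a real database, `rows` would instead be the iterable returned by something like `db.query("sql", "SELECT name, age FROM User")`.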

Graphs with OpenCypher

We can still use SQL for graph DDL and edge creation, then query through OpenCypher.

import arcadedb_embedded as arcadedb

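# A separate path keeps this example independent of the database created above.
with arcadedb.create_database("./graphdb") as db:
    # Graph schema: one vertex type with a name property, and one edge type.
    db.command("sql", "CREATE VERTEX TYPE Person")
    db.command("sql", "CREATE PROPERTY Person.name STRING")
    db.command("sql", "CREATE EDGE TYPE KNOWS")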

    with db.transaction():
        db.command("sql", "INSERT INTO Person SET name = ?", "Alice")
        db.command("sql", "INSERT INTO Person SET name = ?", "Bob")
        db.command(
            "sql",
            "CREATE EDGE KNOWS FROM (SELECT FROM Person WHERE name = ?) TO (SELECT FROM Person WHERE name = ?)",
            "Alice",
            "Bob",
        )

    result = db.query(
        "opencypher",
        "MATCH (a:Person)-[:KNOWS]->(b:Person) RETURN a.name AS from, b.name AS to",
    )
    for r in result:
        print(r.get("from"), "->", r.get("to"))
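Once the edge pairs are back in Python, they are easy to fold into an adjacency map for further processing. This sketch works on plain `(source, target)` tuples, which could be built from the `get("from")` and `get("to")` values in the loop above; `build_adjacency` is an illustrative name, not part of the bindings, and the sample pairs are made up for the demonstration.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def build_adjacency(
    pairs: Iterable[Tuple[Hashable, Hashable]],
) -> Dict[Hashable, List[Hashable]]:
    """Group (source, target) pairs into {source: [targets...]}.

    Insertion order is kept and duplicate edges are skipped.
    """
    adjacency: Dict[Hashable, List[Hashable]] = defaultdict(list)
    for source, target in pairs:
        if target not in adjacency[source]:
            adjacency[source].append(target)
    return dict(adjacency)


# With a live database the pairs would come from the query result, e.g.
#   pairs = [(r.get("from"), r.get("to")) for r in result]
pairs = [("Alice", "Bob"), ("Alice", "Carol"), ("Bob", "Carol"), ("Alice", "Bob")]
graph = build_adjacency(pairs)
print(graph)
```

For larger graphs it is usually better to push traversal back into OpenCypher or SQL and keep only the final result in Python.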

Vectors (HNSW / JVector)

Vector workflows are also driven through SQL, with Python mainly supplying values and parameters.

import arcadedb_embedded as arcadedb

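# A separate path keeps this example independent of the databases created above.
with arcadedb.create_database("./vectordb") as db:
    # Schema: a vertex type holding a name and a float-array embedding.
    db.command("sql", "CREATE VERTEX TYPE VecDoc")
    db.command("sql", "CREATE PROPERTY VecDoc.name STRING")
    db.command("sql", "CREATE PROPERTY VecDoc.vector ARRAY_OF_FLOATS")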
    db.command(
        "sql",
        """
        CREATE INDEX ON VecDoc (vector)
        LSM_VECTOR
        METADATA {"dimensions": 4, "similarity": "COSINE"}
        """,
    )

    with db.transaction():
        db.command(
            "sql",
            "INSERT INTO VecDoc SET name = :name, vector = :vector",
            {"name": "Apple", "vector": arcadedb.to_java_float_array([1.0, 0.0, 0.0, 0.0])},
        )
        db.command(
            "sql",
            "INSERT INTO VecDoc SET name = :name, vector = :vector",
            {"name": "Banana", "vector": arcadedb.to_java_float_array([0.9, 0.1, 0.0, 0.0])},
        )
        db.command(
            "sql",
            "INSERT INTO VecDoc SET name = :name, vector = :vector",
            {"name": "Car", "vector": arcadedb.to_java_float_array([0.0, 0.0, 1.0, 0.0])},
        )

    results = db.query(
        "sql",
        "SELECT name, distance FROM (SELECT expand(vectorNeighbors(?, ?, ?))) ORDER BY distance",
        "VecDoc[vector]",
        arcadedb.to_java_float_array([0.95, 0.05, 0.0, 0.0]),
        2,
    )
    for row in results:
        print(row.get("name"), row.get("distance"))
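To build intuition for what a COSINE index ranks, the same three vectors can be scored by hand in plain Python. This is a brute-force sketch of cosine similarity only; it does not reproduce how the index searches, and the `distance` values ArcadeDB returns may follow a different scale or convention. `cosine_similarity` and `top_k` are illustrative names.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for zero vectors")
    return dot / (norm_a * norm_b)


def top_k(
    query: Sequence[float], docs: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Score every document against the query and keep the k most similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in docs.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Same vectors and query as in the example above.
docs = {
    "Apple": [1.0, 0.0, 0.0, 0.0],
    "Banana": [0.9, 0.1, 0.0, 0.0],
    "Car": [0.0, 0.0, 1.0, 0.0],
}
nearest = top_k([0.95, 0.05, 0.0, 0.0], docs, 2)
for name, score in nearest:
    print(name, round(score, 4))
```

Under cosine similarity the query sits closest to Apple, then Banana, while Car is orthogonal to it and drops out of the top two.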

Use Cases We Care About

Memory graphs for agents (long‑term + episodic knowledge)
Vector‑augmented retrieval with structured context
Local developer environments that mirror production behavior
Embedded analytics without running a separate DB server

What’s Next

We’ll share tutorials, benchmarks, and integration notes as we deepen our use of ArcadeDB across HumemAI. We’re also working to bring more of its capabilities into research, development, and product work across our memory stack. If you’re building similar systems, we’d love to hear from you.