JSON in Databases
Modern databases support storing and querying JSON natively. This guide covers how each major database handles JSON, which operators to use, how to index JSON fields, and when you should and should not reach for a JSON column.
Traditional relational databases store data in rigid, fixed-schema tables. Every row must have the same columns, and adding a new attribute requires an ALTER TABLE migration. For many real-world use cases, such as per-user preferences, variable product attributes, or third-party event payloads, this is too inflexible.
The key insight: JSON columns work best for semi-structured data where the schema varies between rows. For structured, uniform data, a normalised relational schema is still faster and more maintainable. See DB-Engines ranking for a comparison of relational vs. document databases by popularity.
PostgreSQL offers two JSON types: json and jsonb. The jsonb type is almost always the right choice. It stores JSON in a decomposed binary format, which is faster to query and supports indexing. The plain json type stores the raw text and re-parses it on every access.
Read the full details in the PostgreSQL JSON documentation.
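A quick way to see the difference between the two types is how each treats duplicate keys and formatting. A minimal sketch you can run in psql:

```sql
-- json preserves the input text verbatim, including duplicate keys
SELECT '{"a": 1, "a": 2, "b": 3}'::json;
-- {"a": 1, "a": 2, "b": 3}

-- jsonb parses into binary form: duplicate keys collapse to the
-- last value, and keys may be stored in a different order
SELECT '{"a": 1, "a": 2, "b": 3}'::jsonb;
-- {"a": 2, "b": 3}
```

This matters if you need byte-for-byte round-tripping of documents; for querying, the jsonb behaviour is almost always what you want.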
```sql
CREATE TABLE orders (
    id          SERIAL PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    metadata    JSONB,
    created_at  TIMESTAMPTZ DEFAULT NOW()
);

-- Insert a row with JSON metadata
INSERT INTO orders (customer_id, metadata)
VALUES (42, '{"status": "pending", "tags": ["urgent", "vip"], "address": {"city": "London"}}');
```
| Operator | Meaning | Example |
|---|---|---|
| `->` | Get value as JSON | `metadata->'address'` |
| `->>` | Get value as text | `metadata->>'status'` |
| `#>` | Get nested value as JSON (path) | `metadata#>'{address,city}'` |
| `#>>` | Get nested value as text (path) | `metadata#>>'{address,city}'` |
| `@>` | Contains (left contains right) | `metadata @> '{"status":"pending"}'` |
| `?` | Key exists | `metadata ? 'tags'` |
```sql
-- Find orders with status = 'pending'
SELECT id, customer_id
FROM orders
WHERE metadata->>'status' = 'pending';

-- Find orders that contain a specific tag (array containment)
SELECT id
FROM orders
WHERE metadata @> '{"tags": ["urgent"]}';

-- Extract a nested field
SELECT id, metadata#>>'{address,city}' AS city
FROM orders;

-- Update a single JSON key without overwriting the whole column
UPDATE orders
SET metadata = jsonb_set(metadata, '{status}', '"shipped"')
WHERE id = 1;
```
A GIN (Generalized Inverted Index) on a jsonb column enables fast containment (@>) and existence (?) queries. Without it, every query scans the entire table.
```sql
-- GIN index for @> and ? operators
CREATE INDEX idx_orders_metadata ON orders USING GIN (metadata);

-- Functional B-tree index for a specific key (faster for equality/range on one field)
CREATE INDEX idx_orders_status ON orders ((metadata->>'status'));

-- EXPLAIN ANALYZE shows whether the index is used
EXPLAIN ANALYZE
SELECT id FROM orders WHERE metadata->>'status' = 'pending';
```
The GIN index is the most versatile — it covers any key in the JSON document. Use a functional index when you always query by the same specific key and need range comparisons (<, >, BETWEEN).
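As a sketch of the range-comparison case, suppose each document also carried a numeric total key (hypothetical, not part of the example schema above). A functional index on the cast value supports range scans:

```sql
-- Hypothetical "total" key, cast to numeric so the B-tree
-- index can serve <, >, and BETWEEN predicates
CREATE INDEX idx_orders_total
    ON orders (((metadata->>'total')::numeric));

SELECT id
FROM orders
WHERE (metadata->>'total')::numeric BETWEEN 100 AND 500;
```

The predicate must use the same expression as the index definition (including the cast) for the planner to consider the index.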
MySQL added a native JSON data type in version 5.7.8. It automatically validates that the stored value is valid JSON and provides a rich set of functions for querying and mutating JSON data. Full documentation is in the MySQL JSON documentation.
```sql
CREATE TABLE products (
    id    INT AUTO_INCREMENT PRIMARY KEY,
    name  VARCHAR(255) NOT NULL,
    attrs JSON
);

INSERT INTO products (name, attrs)
VALUES ('Laptop', '{"brand": "Acme", "ram_gb": 16, "tags": ["portable", "work"]}');
```
```sql
-- Extract a value (returns a quoted string for string values)
SELECT JSON_EXTRACT(attrs, '$.brand') AS brand FROM products;

-- Shorthand unquoting path operator (MySQL 5.7.13+)
SELECT attrs->>'$.brand' AS brand FROM products;

-- Filter rows where ram_gb >= 16
SELECT name FROM products WHERE JSON_EXTRACT(attrs, '$.ram_gb') >= 16;

-- Check if a value exists in a JSON array
SELECT name FROM products WHERE JSON_CONTAINS(attrs->'$.tags', '"portable"');

-- Update one key inside the JSON
UPDATE products SET attrs = JSON_SET(attrs, '$.ram_gb', 32) WHERE id = 1;
```
MySQL does not support indexing the entire JSON column directly. Instead, use a generated column to extract a specific path, then index that column:
```sql
ALTER TABLE products
    ADD COLUMN brand VARCHAR(100)
        GENERATED ALWAYS AS (attrs->>'$.brand') STORED,
    ADD INDEX idx_brand (brand);
```
MySQL 8.0 also supports multi-valued indexes on JSON arrays, enabling efficient MEMBER OF and JSON_CONTAINS queries on array elements.
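A sketch of a multi-valued index on the tags array from the products example above. Multi-valued indexes require MySQL 8.0.17 or later, and the CHAR(64) width here is an assumption about maximum tag length:

```sql
-- Multi-valued index: one index entry per element of the tags array
ALTER TABLE products
    ADD INDEX idx_tags ((CAST(attrs->'$.tags' AS CHAR(64) ARRAY)));

-- Both of these predicates can use the multi-valued index
SELECT name FROM products WHERE 'portable' MEMBER OF (attrs->'$.tags');
SELECT name FROM products WHERE JSON_CONTAINS(attrs->'$.tags', '"portable"');
```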
Unlike PostgreSQL and MySQL — relational databases that added JSON support as a feature — MongoDB was designed from the ground up as a document database. Every record in MongoDB is a BSON document (Binary JSON), and collections are groups of documents with no enforced schema.
```javascript
// mongosh / Node.js driver
db.orders.insertOne({
  customerId: 42,
  status: "pending",
  tags: ["urgent", "vip"],
  address: { city: "London", postcode: "SW1A 1AA" },
  createdAt: new Date()
});

// Find all pending orders
db.orders.find({ status: "pending" });

// Query nested field
db.orders.find({ "address.city": "London" });

// Query array element
db.orders.find({ tags: "urgent" });

// Projection: return only specific fields
db.orders.find({ status: "pending" }, { customerId: 1, "address.city": 1 });

// Update a nested field
db.orders.updateOne(
  { _id: ObjectId("...") },
  { $set: { status: "shipped", "address.postcode": "EC1A 1BB" } }
);
```
MongoDB supports single-field, compound, multikey (array), text, and geospatial indexes. You can index any field in a document, including nested fields and array elements:
```javascript
// Index on a top-level field
db.orders.createIndex({ status: 1 });

// Index on a nested field
db.orders.createIndex({ "address.city": 1 });

// Multikey index on an array field (created automatically for arrays)
db.orders.createIndex({ tags: 1 });

// Compound index
db.orders.createIndex({ status: 1, createdAt: -1 });
```
The same logical query — "find all orders with status = 'pending'" — looks different in each database. This table summarises the key differences:
| Operation | PostgreSQL JSONB | MySQL JSON | MongoDB |
|---|---|---|---|
| Read a field | metadata->>'status' | attrs->>'$.status' | doc.status |
| Filter by field value | WHERE metadata->>'status'='pending' | WHERE attrs->>'$.status'='pending' | find({status:"pending"}) |
| Nested field | metadata#>>'{address,city}' | attrs->>'$.address.city' | {"address.city": "London"} |
| Array contains value | metadata @> '{"tags":["vip"]}' | JSON_CONTAINS(attrs->'$.tags','"vip"') | find({tags:"vip"}) |
| Update one key | jsonb_set(col,'{key}',val) | JSON_SET(col,'$.key',val) | {$set:{key:val}} |
| Index type | GIN (whole doc) or functional B-tree | Generated column + B-tree | Single-field / compound / multikey |
| Schema enforcement | Optional (via CHECK constraints or triggers) | Optional (generated columns) | Optional (JSON Schema validation) |
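As the last row notes, relational databases can enforce partial structure on JSON. A PostgreSQL sketch using a CHECK constraint on the orders table from earlier, requiring every document to carry a string status key:

```sql
-- Reject inserts/updates where metadata lacks a status key
-- or where status is not a JSON string
ALTER TABLE orders
    ADD CONSTRAINT metadata_has_status
    CHECK (metadata ? 'status'
           AND jsonb_typeof(metadata->'status') = 'string');
```

Note that a NULL metadata column still passes a CHECK constraint; add NOT NULL on the column if every row must have a document.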
JSON columns are convenient but come with trade-offs. Understanding them helps you make better decisions about when to use them.
Without an index, every JSON query does a full table scan. In PostgreSQL, a GIN index on a jsonb column covers any path with the @> operator. For specific, frequently-queried paths, a functional index on the extracted value is faster for equality and range comparisons. In MySQL, always use generated columns. In MongoDB, call createIndex() on any field you use in find() or sort().
JSON is verbose. A JSON column storing a 100-byte document takes more space than six separate integer/varchar columns encoding the same data, because key names are stored with every row. PostgreSQL's jsonb is a compact binary format and large values are compressed via TOAST, but it still uses more disk than a fully normalised schema. For tables with millions of rows, this adds up.
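In PostgreSQL you can measure this overhead directly: pg_column_size reports the stored size of a value in bytes, after any compression. A quick sketch against the orders table:

```sql
-- Per-row stored size of the jsonb column, largest first
SELECT id, pg_column_size(metadata) AS metadata_bytes
FROM orders
ORDER BY metadata_bytes DESC
LIMIT 10;
```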
Queries on metadata#>>'{a,b,c,d}' are harder to index efficiently and often require custom functional indexes. If you find yourself querying deeply nested paths frequently, consider flattening the JSON structure or promoting those fields into regular columns.
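If one deep path is queried constantly, a functional index on that exact path expression is one option, sketched here for the nested city field from the earlier example:

```sql
-- B-tree index on the extracted city, for equality filters on a deep path
CREATE INDEX idx_orders_city
    ON orders ((metadata#>>'{address,city}'));

SELECT id
FROM orders
WHERE metadata#>>'{address,city}' = 'London';
```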
In relational databases, joining on a JSON field (e.g., WHERE orders.metadata->>'customer_id' = customers.id::text) is significantly slower than joining on a native integer foreign key. Extract foreign keys into real columns and use JSON only for genuinely variable-structure data.
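One way to do that extraction in PostgreSQL 12+ is a stored generated column. In this sketch, the customers table and the customer_id key inside metadata are assumptions for illustration (the example orders table already has a real customer_id column, which is the better design):

```sql
-- Promote the JSON field into a real, indexable integer column
ALTER TABLE orders
    ADD COLUMN meta_customer_id INTEGER
        GENERATED ALWAYS AS ((metadata->>'customer_id')::integer) STORED;

CREATE INDEX idx_orders_meta_customer ON orders (meta_customer_id);

-- The join now compares native integers instead of extracted text
SELECT o.id
FROM orders o
JOIN customers c ON c.id = o.meta_customer_id;
```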
PostgreSQL's EXPLAIN ANALYZE shows exactly how the query planner executes a query, what indexes it uses, and how long each step takes. Always run it on queries that touch JSON columns before deploying to production.
JSON columns solve real problems but are frequently overused. The clearest sign you should use a normalised relational schema instead is heavy aggregation: SUM, AVG, and GROUP BY on a JSON field require casting and are slower than on a native column.

A common mistake is treating JSON columns as a schema-free escape hatch. Start with a proper relational design and introduce JSON only for the specific attributes that are genuinely variable. Resources like MDN Web Docs also cover broader database security considerations worth reviewing before storing user-supplied JSON.
Should you store JSON in a database? Yes — for semi-structured or variable-schema data. JSON columns are ideal for fields where the shape varies per row: user preferences, product attributes, event payloads. For structured data with fixed columns, normalise into regular relational tables for better query performance and referential integrity.

What is the difference between json and jsonb in PostgreSQL? json stores the raw text as-is, preserving whitespace and key order, and re-parses on every access. jsonb stores a parsed binary representation: writes are slightly slower but reads, operators, and indexing are significantly faster. In practice, always use jsonb unless you specifically need to preserve whitespace or duplicate keys.

Can you index a JSON column in PostgreSQL? Yes. A GIN index on a jsonb column covers the entire document for containment (@>) and existence (?) queries. For a specific frequently-queried key, create a functional B-tree index: CREATE INDEX ON orders ((metadata->>'status'));

Does MySQL support JSON? Yes. MySQL 5.7.8 added a native JSON data type with automatic validation and functions including JSON_EXTRACT, JSON_SET, JSON_REMOVE, and JSON_CONTAINS. MySQL 8.0 added multi-valued indexes on JSON arrays. Read the MySQL JSON documentation for the full function reference.

When should you choose MongoDB over PostgreSQL JSONB? MongoDB is a document database built entirely around JSON-style documents (stored as BSON). Every record is a document. PostgreSQL is a relational database with an optional JSON column — you can mix relational and JSON storage in the same table. MongoDB excels at document-heavy workloads with variable schemas across collections. PostgreSQL JSONB is best when you have a mix of structured relational data and semi-structured JSON attributes in the same rows.
Make sure your JSON is valid before inserting it into PostgreSQL, MySQL, or MongoDB: a single syntax error will cause all three databases to reject the write.

Further reading: PostgreSQL JSON documentation | MySQL JSON documentation | MongoDB document model | MDN JSON reference | DB-Engines ranking
Also useful: What is JSON? | JSON Schema Tutorial | How to Validate JSON | JSON to SQL