Principle:Heibaiying BigData Notes HBase Table Creation

From Leeroopedia


Knowledge Sources
Domains NoSQL, Big_Data
Last Updated 2026-02-10 10:00 GMT

Overview

HBase tables are defined at creation time by their column families, which determine the physical storage layout and properties of the data they contain.

Description

In HBase, a table is a sparse, distributed, sorted map indexed by row key. Unlike a relational table, whose schema fixes every column, an HBase table's schema declares only its column families; individual column qualifiers are created when data is written. A column family:

  • Groups related columns together under a common prefix (e.g., info:name, info:age share the info family).
  • Determines physical storage -- each column family is stored in its own set of HFiles on HDFS.
  • Carries configuration properties such as compression algorithm, block size, time-to-live (TTL), and maximum number of versions.
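The logical model described above can be sketched with nested sorted maps. This is a toy in-memory illustration, not the HBase API: the class SparseTableSketch and its methods are hypothetical, cell versions and timestamps are left out, and Java String ordering stands in for HBase's byte-wise row-key ordering.

```java
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the logical layout: row key -> "family:qualifier" -> value. */
final class SparseTableSketch {
    // Both levels are TreeMaps, so rows and columns stay sorted by key.
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    /** Stores one cell; the column is addressed as family plus qualifier. */
    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .put(family + ":" + qualifier, value);
    }

    /** Returns the cell value, or null if this row never stored that column. */
    String get(String rowKey, String family, String qualifier) {
        Map<String, String> cells = rows.get(rowKey);
        return cells == null ? null : cells.get(family + ":" + qualifier);
    }

    /** Row keys in sorted order, the order in which a scan would visit them. */
    Iterable<String> rowKeys() {
        return rows.keySet();
    }
}
```

A row that never wrote info:age simply has no entry for it, which is what "sparse" means here: absent cells take no space rather than holding a null placeholder.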

Table creation is an administrative operation performed through the Admin interface. The process in HBase 2.x uses the builder pattern:

  1. Obtain an Admin handle from the connection.
  2. Check whether the table already exists using admin.tableExists().
  3. Build ColumnFamilyDescriptor objects for each column family using ColumnFamilyDescriptorBuilder.
  4. Build a TableDescriptor using TableDescriptorBuilder, attaching the column family descriptors.
  5. Call admin.createTable(tableDescriptor).
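The control flow of these steps can be sketched without a cluster. The real client needs the HBase client library and a running HBase instance, so the code below uses small hypothetical stand-ins: TableSpec, InMemoryAdmin, and CreateTableFlow are not HBase classes, and only the method names tableExists and createTable echo the calls named above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a table descriptor: a table name plus its column families. */
final class TableSpec {
    final String name;
    final List<String> families;

    TableSpec(String name, List<String> families) {
        this.name = name;
        // Defensive copy so the descriptor cannot change after it is built.
        this.families = Collections.unmodifiableList(new ArrayList<>(families));
    }
}

/** Hypothetical in-memory stand-in for Admin; it only mirrors tableExists and createTable. */
final class InMemoryAdmin {
    private final Map<String, TableSpec> tables = new HashMap<>();

    boolean tableExists(String name) {
        return tables.containsKey(name);
    }

    void createTable(TableSpec spec) {
        // Assumption: fail loudly on duplicates, standing in for the real client's error.
        if (tables.putIfAbsent(spec.name, spec) != null) {
            throw new IllegalStateException("table already exists: " + spec.name);
        }
    }
}

final class CreateTableFlow {
    /**
     * Step 1 (obtaining the admin handle) is left to the caller.
     * Returns true if the table was created, false if it already existed.
     */
    static boolean createIfAbsent(InMemoryAdmin admin, String name, List<String> families) {
        if (admin.tableExists(name)) {                   // step 2: existence check
            return false;
        }
        TableSpec spec = new TableSpec(name, families);  // steps 3-4: build the descriptors
        admin.createTable(spec);                         // step 5: create the table
        return true;
    }
}
```

Calling createIfAbsent twice with the same table name creates the table once and returns false the second time, while an unguarded second createTable call fails.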

Important considerations:

  • Keep column families few in number (typically 1-3). Each family has its own MemStore and its own HFiles in every region, so each additional family adds flush and compaction work and can degrade performance.
  • Column families are normally declared at table creation time. They can be added or removed later by altering the schema, but this is an expensive operation.
  • The table name must be unique within its HBase namespace.

Usage

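Table creation is performed during application setup or schema migration. When guarded by an existence check it is effectively idempotent: rerunning the setup leaves an existing table untouched. The check and the create are separate calls, however, so concurrent callers can still race, in which case createTable throws TableExistsException for the caller that loses. Common scenarios include: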

  • Initial deployment of an application that requires specific HBase tables.
  • Automated provisioning scripts for development or testing environments.
  • Schema evolution when new column families are needed.

Theoretical Basis

The HBase table creation model reflects its column-family-oriented storage architecture. Within each region of the table, every column family has its own directory of HFiles (<region> below stands for the encoded region name):

Table "users"
  |
  |-- Column Family "info"     -> stored in HFiles under /hbase/data/default/users/<region>/info/
  |     |-- qualifier "name"
  |     |-- qualifier "email"
  |
  |-- Column Family "metrics"  -> stored in HFiles under /hbase/data/default/users/<region>/metrics/
        |-- qualifier "login_count"
        |-- qualifier "last_seen"

The builder pattern used in HBase 2.x replaces the deprecated HTableDescriptor and HColumnDescriptor classes from 1.x:

HBase 1.x (deprecated):
    HTableDescriptor + HColumnDescriptor

HBase 2.x (current):
    TableDescriptorBuilder + ColumnFamilyDescriptorBuilder

The builder pattern ensures immutability of the resulting descriptor objects and provides a fluent API for setting properties.
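How a builder yields an immutable descriptor can be shown with a simplified stand-in. FamilySpec below is a hypothetical class, not ColumnFamilyDescriptorBuilder; its setter names only loosely mirror the fluent style, and the defaults are chosen for the sketch rather than taken from HBase.

```java
/** Hypothetical, simplified column family descriptor with a fluent builder. */
final class FamilySpec {
    private final String name;
    private final int maxVersions;
    private final int ttlSeconds;

    // Only the builder can construct a descriptor; all fields are final.
    private FamilySpec(Builder b) {
        this.name = b.name;
        this.maxVersions = b.maxVersions;
        this.ttlSeconds = b.ttlSeconds;
    }

    String name() { return name; }
    int maxVersions() { return maxVersions; }
    int ttlSeconds() { return ttlSeconds; }

    static Builder newBuilder(String name) {
        return new Builder(name);
    }

    /** Mutable builder; each setter returns this so calls can be chained. */
    static final class Builder {
        private final String name;
        private int maxVersions = 1;                 // sketch default, an assumption
        private int ttlSeconds = Integer.MAX_VALUE;  // sketch default meaning "never expire"

        private Builder(String name) {
            this.name = name;
        }

        Builder setMaxVersions(int maxVersions) {
            if (maxVersions < 1) {
                throw new IllegalArgumentException("maxVersions must be at least 1");
            }
            this.maxVersions = maxVersions;
            return this;
        }

        Builder setTimeToLive(int ttlSeconds) {
            if (ttlSeconds < 1) {
                throw new IllegalArgumentException("ttlSeconds must be positive");
            }
            this.ttlSeconds = ttlSeconds;
            return this;
        }

        /** Copies the current settings into a new immutable descriptor. */
        FamilySpec build() {
            return new FamilySpec(this);
        }
    }
}
```

Because build() copies the values, later calls on the same builder do not affect descriptors that were already built, which is the property that makes such descriptors safe to share.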
