Implementation:DataTalksClub Data engineering zoomcamp Java Ride Data Model
| Knowledge Sources | |
|---|---|
| Domains | Streaming, Kafka |
| Last Updated | 2026-02-09 00:00 GMT |
Overview
Ride is a Java data model class representing a NYC taxi ride record, with public fields for all ride attributes and a CSV-parsing constructor that maps a String[] row into a fully populated Ride object.
Description
The Ride class in the org.example.data package is the core data model shared by all Java Kafka examples in the streaming module. It is a plain old Java object (POJO) with public fields and two constructors:
- Ride(String[] arr) -- Parses a CSV row (as a String[]) into a Ride object, mapping 18 columns to their corresponding fields:
  - arr[0] to VendorID (String)
  - arr[1] to tpep_pickup_datetime (LocalDateTime, parsed with "yyyy-MM-dd HH:mm:ss" format)
  - arr[2] to tpep_dropoff_datetime (LocalDateTime)
  - arr[3] to passenger_count (int)
  - arr[4] to trip_distance (double)
  - arr[5] to RatecodeID (long)
  - arr[6] to store_and_fwd_flag (String)
  - arr[7] to PULocationID (long)
  - arr[8] to DOLocationID (long)
  - arr[9] to payment_type (String)
  - arr[10]-arr[17] to fare-related fields: fare_amount, extra, mta_tax, tip_amount, tolls_amount, improvement_surcharge, total_amount, congestion_surcharge (all double)
- Ride() -- Default no-argument constructor, required for JSON deserialization by KafkaJsonDeserializer.
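The column mapping above can be sketched as follows. This is a minimal illustration, not the repository's Ride.java: the class name RideParseSketch is hypothetical, only a subset of the 18 fields is shown, and the use of Integer.parseInt, Long.parseLong, and Double.parseDouble for the numeric columns is an assumption about how the strings are converted.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical stand-in for org.example.data.Ride; only some fields are shown.
class RideParseSketch {
    // Date-time pattern named in the description of arr[1].
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    public String VendorID;
    public LocalDateTime tpep_pickup_datetime;
    public LocalDateTime tpep_dropoff_datetime;
    public int passenger_count;
    public double trip_distance;
    public long RatecodeID;
    public double total_amount;

    // Assumed conversions: each column is parsed with the matching wrapper-class parser.
    RideParseSketch(String[] arr) {
        VendorID = arr[0];
        tpep_pickup_datetime = LocalDateTime.parse(arr[1], FORMAT);
        tpep_dropoff_datetime = LocalDateTime.parse(arr[2], FORMAT);
        passenger_count = Integer.parseInt(arr[3]);
        trip_distance = Double.parseDouble(arr[4]);
        RatecodeID = Long.parseLong(arr[5]);
        total_amount = Double.parseDouble(arr[16]);
    }

    public static void main(String[] args) {
        String[] row = {"1", "2023-01-15 08:30:00", "2023-01-15 08:45:00",
                "2", "3.5", "1", "N", "142", "236", "1",
                "15.00", "0.50", "0.50", "3.00", "0.00", "0.30", "19.30", "2.50"};
        RideParseSketch ride = new RideParseSketch(row);
        System.out.println(ride.tpep_pickup_datetime + " total=" + ride.total_amount);
    }
}
```

Under these assumptions, a malformed number or timestamp would surface as a NumberFormatException or DateTimeParseException at construction time rather than later in the pipeline.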
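All fields are public, so the class works with both KafkaJsonSerializer and KafkaJsonDeserializer without getters or setters. Both are built on Jackson, which by default writes public fields during serialization and, during deserialization, creates the object through the no-argument constructor and then assigns the public fields.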
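To see why public fields plus a no-argument constructor are enough for a field-based serializer, the sketch below uses plain Java reflection. It is not how KafkaJsonSerializer or KafkaJsonDeserializer are implemented; PublicFieldSketch, MiniRide, toMap, and fromMap are hypothetical names introduced only for illustration.

```java
import java.lang.reflect.Field;
import java.util.Map;
import java.util.TreeMap;

class PublicFieldSketch {
    // Hypothetical miniature of Ride with two public fields and a no-arg constructor.
    public static class MiniRide {
        public String VendorID;
        public int passenger_count;
        public MiniRide() {}
    }

    // Read every public field into a name -> value map, roughly what a serializer needs.
    static Map<String, Object> toMap(Object source) {
        Map<String, Object> out = new TreeMap<>();
        try {
            for (Field field : source.getClass().getFields()) {
                out.put(field.getName(), field.get(source));
            }
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    // Build an instance through the no-arg constructor, then assign public fields by name.
    static <T> T fromMap(Class<T> type, Map<String, Object> values) {
        try {
            T instance = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> entry : values.entrySet()) {
                type.getField(entry.getKey()).set(instance, entry.getValue());
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        MiniRide ride = new MiniRide();
        ride.VendorID = "1";
        ride.passenger_count = 2;
        Map<String, Object> asMap = toMap(ride);
        MiniRide copy = fromMap(MiniRide.class, asMap);
        System.out.println(asMap.keySet() + " -> " + copy.passenger_count);
    }
}
```

Without the no-argument constructor, fromMap would have no way to create the empty instance it fills in, which mirrors the requirement stated for Ride().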
Usage
Use this data model whenever you need to represent a taxi ride record in the Java Kafka examples. It is consumed by JsonProducer (for producing rides), JsonConsumer (for consuming rides), and all Kafka Streams applications (JsonKStream, JsonKStreamJoins, JsonKStreamWindow) as the primary event type.
Code Reference
Source Location
- Repository: DataTalksClub_Data_engineering_zoomcamp
- File: 07-streaming/java/kafka_examples/src/main/java/org/example/data/Ride.java
- Lines: 1-49
Signature
public class Ride {
    public String VendorID;
    public LocalDateTime tpep_pickup_datetime;
    public LocalDateTime tpep_dropoff_datetime;
    public int passenger_count;
    public double trip_distance;
    public long RatecodeID;
    public String store_and_fwd_flag;
    public long PULocationID;
    public long DOLocationID;
    public String payment_type;
    public double fare_amount;
    public double extra;
    public double mta_tax;
    public double tip_amount;
    public double tolls_amount;
    public double improvement_surcharge;
    public double total_amount;
    public double congestion_surcharge;

    public Ride(String[] arr)
    public Ride()
}
Import
import org.example.data.Ride;
I/O Contract
Inputs
| Name | Type | Required | Description |
|---|---|---|---|
| arr | String[] | Yes (for CSV constructor) | A CSV row as a String array with 18 elements representing: VendorID, pickup datetime, dropoff datetime, passenger count, trip distance, ratecode ID, store and forward flag, PU location ID, DO location ID, payment type, fare amount, extra, MTA tax, tip amount, tolls amount, improvement surcharge, total amount, congestion surcharge. |
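The source does not say whether the constructor validates its input. Since it reads indices 0 through 17, a caller may want to check the row length first. The helper below is a hypothetical sketch of such a guard; RideRowCheck and requireRideRow are not part of the repository.

```java
class RideRowCheck {
    // The CSV constructor reads arr[0] through arr[17].
    static final int EXPECTED_COLUMNS = 18;

    // Returns the row unchanged if it has enough columns, otherwise fails fast.
    static String[] requireRideRow(String[] arr) {
        if (arr == null) {
            throw new IllegalArgumentException("row is null");
        }
        if (arr.length < EXPECTED_COLUMNS) {
            throw new IllegalArgumentException(
                    "expected at least " + EXPECTED_COLUMNS + " columns but got " + arr.length);
        }
        return arr;
    }

    public static void main(String[] args) {
        try {
            requireRideRow(new String[] {"1", "2023-01-15 08:30:00"});
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

A call such as new Ride(RideRowCheck.requireRideRow(arr)) would then report a short row with a descriptive message instead of an ArrayIndexOutOfBoundsException.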
Outputs
| Name | Type | Description |
|---|---|---|
| Ride object | org.example.data.Ride | A fully populated Ride POJO with all 18 public fields set from the CSV row or via direct assignment. |
Usage Examples
Constructing from CSV Row
// Parse a CSV row into a Ride object
String[] csvRow = {"1", "2023-01-15 08:30:00", "2023-01-15 08:45:00",
        "2", "3.5", "1", "N", "142", "236", "1",
        "15.00", "0.50", "0.50", "3.00", "0.00", "0.30", "19.30", "2.50"};
Ride ride = new Ride(csvRow);
Reading Rides from CSV File
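import com.opencsv.CSVReader;
import java.io.FileReader;
import java.util.List;
import java.util.stream.Collectors;
import org.example.data.Ride;

// Typical pattern used by JsonProducer and AvroProducer
var ridesStream = this.getClass().getResource("/rides.csv"); // URL of the classpath resource
var reader = new CSVReader(new FileReader(ridesStream.getFile()));
reader.skip(1); // Skip header
List<Ride> rides = reader.readAll().stream()
        .map(Ride::new) // Each String[] row goes through the CSV constructor
        .collect(Collectors.toList());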
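The same skip-the-header, map-each-row pattern can be tried without opencsv. The sketch below uses only the standard library and a naive comma split, so unlike CSVReader it does not handle quoted fields; CsvRowsSketch and readRows are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.stream.Collectors;

class CsvRowsSketch {
    // Skip the header line, drop empty lines, and split the rest on commas.
    // Naive split: quoted fields containing commas are NOT supported.
    static List<String[]> readRows(String csvText) {
        try (BufferedReader reader = new BufferedReader(new StringReader(csvText))) {
            return reader.lines()
                    .skip(1)
                    .filter(line -> !line.trim().isEmpty())
                    .map(line -> line.split(",", -1)) // -1 keeps trailing empty columns
                    .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String csv = "VendorID,passenger_count\n1,2\n2,1\n";
        List<String[]> rows = readRows(csv);
        // With full 18-column rows, each String[] could be passed to new Ride(arr).
        System.out.println(rows.size() + " rows read");
    }
}
```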
Default Constructor for Deserialization
// The no-arg constructor is used implicitly by KafkaJsonDeserializer
// when consuming Ride objects from Kafka topics
props.put(KafkaJsonDeserializerConfig.JSON_VALUE_TYPE, Ride.class);
KafkaConsumer<String, Ride> consumer = new KafkaConsumer<>(props);
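The snippet above assumes an existing props object. A minimal sketch of how it might be populated is shown below, using plain string keys so it compiles without the Kafka libraries. The broker address and group ID are placeholders, and the "json.value.type" key is assumed to be the string behind KafkaJsonDeserializerConfig.JSON_VALUE_TYPE; check both against the Kafka and Confluent versions in use.

```java
import java.util.Properties;

class ConsumerPropsSketch {
    // Hypothetical helper: builds consumer properties for reading Ride values as JSON.
    static Properties rideConsumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.KafkaJsonDeserializer");
        // Assumed string form of KafkaJsonDeserializerConfig.JSON_VALUE_TYPE.
        props.put("json.value.type", "org.example.data.Ride");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder values, not taken from the repository.
        Properties props = rideConsumerProps("localhost:9092", "rides-consumer-sketch");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```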