Data Type Mapper
Map data types between different programming languages and data formats
Understanding Data Type Mapping
Data type mapping is crucial when transferring data between different systems, as each platform has its own type system with specific behaviors, sizes, and constraints.
Common Data Types Across Systems
String Types
- SQL: VARCHAR, CHAR, TEXT - Fixed or variable-length character strings
- Python: str - Unicode strings, immutable
- JSON: string - Unicode text, normally encoded as UTF-8
- Java: String - Immutable object reference
Integer Types
- SQL: INT (4 bytes), BIGINT (8 bytes), SMALLINT (2 bytes), TINYINT (1 byte; not part of standard SQL and not available in every database)
- Python: int - Arbitrary precision integer
- JSON: number - No distinction between int and float
- Pandas: int64, int32, int16, int8 - Sized integers
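Because Python integers are unbounded while SQL integer columns are fixed-width, a range check before writing can be useful. The following is a minimal sketch, not a definitive implementation: the `smallest_sql_int_type` helper and the `SQL_INT_BITS` table are hypothetical names introduced here, and all four types are assumed to be signed (some databases treat TINYINT as unsigned).

```python
# Bit widths for the SQL integer types listed above.
# Assumption: every type is signed; some databases make TINYINT unsigned (0..255).
SQL_INT_BITS = {"TINYINT": 8, "SMALLINT": 16, "INT": 32, "BIGINT": 64}


def smallest_sql_int_type(value: int) -> str:
    """Return the narrowest signed SQL integer type that can hold value."""
    for name, bits in SQL_INT_BITS.items():  # dicts keep insertion order
        low, high = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
        if low <= value <= high:
            return name
    raise OverflowError(f"{value} does not fit in a 64-bit BIGINT")


print(smallest_sql_int_type(100))      # TINYINT
print(smallest_sql_int_type(40_000))   # INT
print(smallest_sql_int_type(2 ** 31))  # BIGINT
```

Values that raise `OverflowError` here would need a DECIMAL or text column instead.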
Floating Point Types
- SQL: FLOAT (typically 4 bytes, but the size varies by database), DOUBLE (8 bytes), DECIMAL (exact fixed-point precision)
- Python: float (64-bit), Decimal (arbitrary precision)
- JSON: number - Usually parsed as IEEE 754 double precision; the format itself sets no precision limit
Boolean Types
- SQL: BOOLEAN or BIT
- Python: bool (True/False)
- JSON: boolean (true/false)
Date/Time Types
- SQL: DATE, DATETIME, TIMESTAMP, TIME
- Python: date, datetime, time (from datetime module)
- JSON: string (ISO 8601 format)
- Pandas: datetime64 - NumPy datetime type
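Since JSON has no date type, dates usually travel as ISO 8601 strings. The sketch below shows one way to round-trip Python `datetime` and `date` values through that format using only the standard library; the variable names and sample values are illustrative.

```python
from datetime import date, datetime, timezone

# Serialize a timezone-aware datetime to an ISO 8601 string.
moment = datetime(2024, 3, 15, 12, 30, 0, tzinfo=timezone.utc)
text = moment.isoformat()
print(text)  # 2024-03-15T12:30:00+00:00

# Parse it back; the UTC offset is preserved.
restored = datetime.fromisoformat(text)
print(restored == moment)  # True

# Date-only values work the same way.
day_text = date(2024, 3, 15).isoformat()
restored_day = date.fromisoformat(day_text)
print(day_text)  # 2024-03-15
```

Keeping the timezone offset in the string avoids ambiguity when the receiving system runs in a different timezone.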
Important Considerations
Precision Loss
Be careful when converting between types with different precisions. For example:
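- SQL DECIMAL to Python float may lose precision, because most decimal fractions have no exact binary representation
- Python int to SQL INT overflows for values outside the signed 32-bit range (-2,147,483,648 to 2,147,483,647)
- JSON numbers may lose precision for integers larger than 2^53 (9,007,199,254,740,992) when the parser stores them as doubles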
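The precision issues above can be reproduced in a few lines. This is a small illustration using only the standard library; the variable names are arbitrary, and sending large integers as strings is one common workaround rather than the only option.

```python
import json
from decimal import Decimal

# DECIMAL -> float: each value is rounded to the nearest binary double.
exact = Decimal("0.1") + Decimal("0.2")
approx = float(Decimal("0.1")) + float(Decimal("0.2"))
print(exact)   # 0.3
print(approx)  # 0.30000000000000004

# Doubles can represent every integer only up to 2**53.
big = 2 ** 53 + 1
print(int(float(big)) == big)  # False

# Python's json module keeps integers exact, but a consumer that parses
# JSON numbers as doubles (JavaScript, for example) would not.
round_tripped = json.loads(json.dumps(big))
print(round_tripped == big)  # True

# One workaround: transmit large integers as strings.
as_text = json.dumps({"id": str(big)})
print(as_text)  # {"id": "9007199254740993"}
```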
Null Handling
Different systems handle null/missing values differently:
- SQL: NULL - Special marker for missing data
- Python: None - Singleton object representing absence
- JSON: null - Explicit null value
- Pandas: NaN, NaT, None, pd.NA - Multiple representations, depending on the column dtype
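To see how these null representations line up in practice, the sketch below moves a missing value from JSON into SQL and back through Python's `None`. It uses an in-memory SQLite database purely for illustration; the `users` table and its columns are made up for this example.

```python
import json
import sqlite3

# JSON null becomes Python None.
record = json.loads('{"name": "Ada", "email": null}')
print(record["email"] is None)  # True

# Python None becomes SQL NULL, and comes back as None.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
conn.execute("INSERT INTO users VALUES (?, ?)", (record["name"], record["email"]))
row = conn.execute("SELECT name, email FROM users").fetchone()
null_count = conn.execute(
    "SELECT COUNT(*) FROM users WHERE email IS NULL"
).fetchone()[0]
conn.close()
print(row)         # ('Ada', None)
print(null_count)  # 1

# NaN is a float, not None, and is not equal to itself.
nan = float("nan")
print(nan == nan)  # False

# By default json.dumps writes NaN as the bare token NaN, which is not valid JSON.
print(json.dumps(nan))  # NaN
```

Converting NaN to `None` before serializing is one way to keep the output valid JSON.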
String Encoding
Always be aware of character encoding when working with strings:
- Python strings are Unicode by default
- SQL VARCHAR may have different encodings (UTF-8, Latin-1, etc.)
- JSON requires UTF-8 encoding
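The effect of an encoding mismatch is easy to demonstrate. The short example below encodes the same Python string as UTF-8 and as Latin-1, then decodes each with the wrong codec; the sample text is arbitrary.

```python
text = "café"

# The same four characters take a different number of bytes per encoding.
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")
print(len(text), len(utf8_bytes), len(latin1_bytes))  # 4 5 4

# Decoding UTF-8 bytes as Latin-1 succeeds silently but garbles the text.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # cafÃ©

# Decoding Latin-1 bytes as UTF-8 fails outright.
try:
    latin1_bytes.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError:
    decode_failed = True
print(decode_failed)  # True
```

The byte-length difference also matters for VARCHAR columns whose limit is counted in bytes rather than characters.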
SQL Type Reference
| SQL Type | Description |
| --- | --- |
| VARCHAR | Variable-length string |
| INT | 32-bit integer |
| BIGINT | 64-bit integer |
| FLOAT | 32-bit floating point |
| DOUBLE | 64-bit floating point |
| BOOLEAN | True/False |
| DATE | Date only |
| TIMESTAMP | Date and time |
| BLOB | Binary data |
| JSON | JSON document |
Python Type Reference
| Python Type | Description |
| --- | --- |
| str | Unicode string |
| int | Arbitrary-precision integer |
| float | 64-bit floating point |
| bool | True/False |
| dict | Key-value pairs |
| list | Ordered collection |
| datetime | Date and time |
| bytes | Binary data |
| Decimal | Exact decimal precision |
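The two reference tables above can be combined into a simple lookup. The following is a minimal sketch of such a mapper, not a definitive implementation: `SQL_TO_PYTHON` and `python_type_for` are hypothetical names, the pairings are one reasonable choice among several, and DECIMAL is taken from the floating point section rather than from the SQL table.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical SQL -> Python pairings; adjust them to your database and driver.
SQL_TO_PYTHON = {
    "VARCHAR": str,
    "INT": int,
    "BIGINT": int,
    "FLOAT": float,
    "DOUBLE": float,
    "DECIMAL": Decimal,
    "BOOLEAN": bool,
    "DATE": date,
    "TIMESTAMP": datetime,
    "BLOB": bytes,
    "JSON": dict,  # assumption: a JSON column may also hold a list or scalar
}


def python_type_for(sql_type: str) -> type:
    """Look up the Python type for a SQL type name.

    Case is ignored, as is any length or precision suffix such as VARCHAR(255).
    """
    base = sql_type.split("(", 1)[0].strip().upper()
    try:
        return SQL_TO_PYTHON[base]
    except KeyError:
        raise ValueError(f"no mapping defined for SQL type {sql_type!r}") from None


print(python_type_for("varchar(255)").__name__)    # str
print(python_type_for("DECIMAL(10, 2)").__name__)  # Decimal
```

Raising on unknown types, rather than falling back to `str`, makes unmapped columns visible early.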