Class BroadcastableSchemaInfo

  • All Implemented Interfaces:
    java.io.Serializable

    public final class BroadcastableSchemaInfo
    extends java.lang.Object
    implements java.io.Serializable
    Broadcastable wrapper for schema information with ZERO transient fields to optimize Spark broadcasting.

    Contains BroadcastableTableSchema (pre-computed schema data) and UDT statements. Executors reconstruct CassandraSchemaInfo and TableSchema from these fields.

    Why having ZERO transient fields matters:
    Spark's SizeEstimator uses reflection to estimate an object's size before broadcasting. Each transient field is still visited, even though Java serialization would skip it, and forces SizeEstimator to inspect the field's type hierarchy, which is expensive. Logger references are particularly costly because of their deep object graphs (appenders, layouts, contexts). By eliminating ALL transient fields and Logger references, we:

    • Minimize SizeEstimator reflection overhead during broadcast preparation
    • Reduce broadcast variable serialization size
    • Avoid accidental serialization of non-serializable objects
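
    The zero-transient-field property can be checked mechanically. The following is a minimal sketch, not part of this library: PlainSchemaHolder, HolderWithTransientCache and TransientFieldAudit are hypothetical names introduced only for illustration. It uses standard java.lang.reflect calls to list the non-static transient fields declared on a class and its superclasses.

```java
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical holder in the style described above: plain final data fields only.
final class PlainSchemaHolder implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String createTableStatement;
    private final Set<String> udtStatements;

    PlainSchemaHolder(String createTableStatement, Set<String> udtStatements) {
        this.createTableStatement = createTableStatement;
        this.udtStatements = udtStatements;
    }
}

// Hypothetical counter-example: keeps a derived value in a transient field.
final class HolderWithTransientCache implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String createTableStatement;
    private transient Object cachedParsedSchema;

    HolderWithTransientCache(String createTableStatement) {
        this.createTableStatement = createTableStatement;
    }
}

// Hypothetical helper: reports transient instance fields found via reflection.
final class TransientFieldAudit {
    private TransientFieldAudit() {
    }

    static List<String> transientFields(Class<?> type) {
        List<String> names = new ArrayList<>();
        // Walk the class and its superclasses, stopping before java.lang.Object.
        for (Class<?> c = type; c != null && c != Object.class; c = c.getSuperclass()) {
            for (Field field : c.getDeclaredFields()) {
                int modifiers = field.getModifiers();
                if (Modifier.isTransient(modifiers) && !Modifier.isStatic(modifiers)) {
                    names.add(c.getSimpleName() + "." + field.getName());
                }
            }
        }
        return names;
    }
}
```

    A unit test could call TransientFieldAudit.transientFields on the broadcast class and fail if the returned list is not empty, which guards against a transient field being reintroduced later.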
    See Also:
    Serialized Form
    • Method Detail

      • from

        public static BroadcastableSchemaInfo from(@NotNull SchemaInfo source)
        Creates a BroadcastableSchemaInfo from a source SchemaInfo. Extracts a BroadcastableTableSchema from the source so that the source's Logger is never serialized.
        Parameters:
        source - the source SchemaInfo (typically CassandraSchemaInfo)
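
        The general shape of such a factory can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: LiveSchema and SchemaSnapshot are hypothetical stand-ins for SchemaInfo and BroadcastableSchemaInfo, and java.util.logging.Logger stands in for whatever logging API the real source class uses. The factory copies only plain data out of the source, so the resulting object survives a Java serialization round trip even though the source holds a non-serializable Logger.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Objects;
import java.util.Set;
import java.util.logging.Logger;

// Hypothetical source object: carries a Logger, which is not Serializable.
final class LiveSchema {
    private final Logger logger = Logger.getLogger("LiveSchema");
    private final String createStatement;
    private final Set<String> udtStatements;

    LiveSchema(String createStatement, Set<String> udtStatements) {
        this.createStatement = createStatement;
        this.udtStatements = udtStatements;
    }

    String getCreateStatement() {
        return createStatement;
    }

    Set<String> getUdtStatements() {
        return udtStatements;
    }
}

// Hypothetical broadcast-friendly copy: final data fields, no Logger, nothing transient.
final class SchemaSnapshot implements Serializable {
    private static final long serialVersionUID = 1L;
    private final String createStatement;
    private final Set<String> udtStatements;

    private SchemaSnapshot(String createStatement, Set<String> udtStatements) {
        this.createStatement = createStatement;
        this.udtStatements = udtStatements;
    }

    // Copies the data out of the source; the source itself is never referenced afterwards.
    static SchemaSnapshot from(LiveSchema source) {
        Objects.requireNonNull(source, "source");
        return new SchemaSnapshot(source.getCreateStatement(),
                new LinkedHashSet<>(source.getUdtStatements()));
    }

    String getCreateStatement() {
        return createStatement;
    }

    Set<String> getUdtStatements() {
        return Collections.unmodifiableSet(udtStatements);
    }

    // Serializes and deserializes the snapshot in memory, as a stand-in for shipping it to an executor.
    static SchemaSnapshot roundTrip(SchemaSnapshot snapshot) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(snapshot);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SchemaSnapshot) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("snapshot did not survive serialization", e);
        }
    }
}
```

        Copying the UDT statements into a fresh LinkedHashSet, rather than keeping the caller's set, is a design choice in this sketch: it guarantees the stored collection is serializable and detached from the source object.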
      • getUserDefinedTypeStatements

        @NotNull
        public java.util.Set<java.lang.String> getUserDefinedTypeStatements()