gfw.common.beam.transforms.GroupBy
- class GroupBy(*fields, elements='', dict_fields=True, **kwargs)
Wrapper around beam.GroupBy with automatic labeling.
This transform wraps Beam’s native beam.GroupBy and adds an automatically generated label based on the grouping keys. For example, grouping by ["user", "country"] with elements="Sessions" results in a label like GroupSessionsByUserAndCountry.
If dict_fields=True (default), string positional fields are interpreted as dictionary keys and wrapped with operator.itemgetter(). If False, strings are treated as attribute names.
Example
pcoll | GroupBy("user", "country", elements="Sessions")
- Parameters:
*fields (Any) – Positional key fields to group by. If these are strings and dict_fields=True, they will be interpreted as dictionary keys.
elements (str) – A human-readable label describing the grouped elements (e.g., Messages or Sessions). It is used to generate the step label.
dict_fields (bool) – If True (default), string fields are interpreted as dictionary keys and wrapped with operator.itemgetter(). Set to False to use Beam’s default behavior (attribute access).
**kwargs (Any) – Keyword arguments, with the same meaning as in the beam.GroupBy interface.
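The difference between the two dict_fields modes can be illustrated in plain Python, without Beam. The following is a minimal sketch only: the helper name make_key_fn is hypothetical and not part of gfw.common, and the pass-through of non-string fields is an assumption, since the wrapper's internals are not shown here.

```python
import operator
from types import SimpleNamespace


def make_key_fn(field, dict_fields=True):
    """Hypothetical helper: turn a field into a key-extraction callable."""
    if not isinstance(field, str):
        # Assumption: non-string fields (e.g. callables) are used as given.
        return field
    if dict_fields:
        # Dictionary-style access: element["user"]
        return operator.itemgetter(field)
    # Attribute-style access: element.user
    return operator.attrgetter(field)


row_dict = {"user": "alice", "country": "AR"}
row_obj = SimpleNamespace(user="alice", country="AR")

user_from_dict = make_key_fn("user", dict_fields=True)(row_dict)
user_from_obj = make_key_fn("user", dict_fields=False)(row_obj)
```

With dictionary elements, the default dict_fields=True is the mode that applies; dict_fields=False suits elements such as named tuples or dataclass instances whose fields are attributes.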
Methods
annotations
create_label – Generate a descriptive label for the GroupBy transform based on keys and elements.
default_label
default_type_hints
display_data – Returns the display data associated to a pipeline component.
expand – Applies the wrapped Beam GroupBy transform to the input PCollection.
from_runner_api
get_resource_hints
get_type_hints – Gets and/or initializes type hints for this object.
get_windowing – Returns the window function to be associated with transform's output.
infer_output_type
register_urn
runner_api_requires_keyed_input
to_runner_api
to_runner_api_parameter
to_runner_api_pickled
type_check_inputs
type_check_inputs_or_outputs
type_check_outputs
with_input_types – Annotates the input type of a PTransform with a type-hint.
with_output_types – Annotates the output type of a PTransform with a type-hint.
with_resource_hints – Adds resource hints to the PTransform.
Attributes
label
pipeline
side_inputs
- classmethod create_label(keys, elements)
Generate a descriptive label for the GroupBy transform based on keys and elements.
Constructs a label string combining the human-readable element description and the grouping keys, formatted in a CamelCase style joined by ‘And’.
For example, keys ['user', 'country'] and elements 'Sessions' result in GroupSessionsByUserAndCountry.
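The label format described above can be sketched as follows. This is an illustrative reimplementation, not the library's code: the function name sketch_create_label is hypothetical, and the handling of snake_case keys (splitting on underscores) is an assumption, since the documentation only shows single-word keys.

```python
def sketch_create_label(keys, elements):
    """Illustrative only: build 'Group<Elements>By<Key1>And<Key2>...'."""

    def camel(name):
        # Assumption: snake_case names become CamelCase ("user_id" -> "UserId").
        parts = str(name).split("_")
        return "".join(part[:1].upper() + part[1:] for part in parts)

    by = "And".join(camel(key) for key in keys)
    return f"Group{elements}By{by}"


label = sketch_create_label(["user", "country"], "Sessions")
print(label)  # GroupSessionsByUserAndCountry
```

Deriving the label from the keys gives each grouping step a distinct, readable name in the pipeline graph, which avoids hand-written labels for every GroupBy.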