Questions tagged [tensorflow-transform]

TensorFlow Transform (tf.transform) is a library for data preprocessing with TensorFlow. It enables you to define and execute distributed pre-processing or feature engineering functions on large data sets, and then export the same functions as a TensorFlow graph for re-use during training or serving. It also comes with pre-implemented functions for common tasks like normalization, vocabulary generation and bucketization.
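
To illustrate the two-phase idea described above (a full pass over the data to compute statistics, then a per-row transform that is reused unchanged at training and serving time), here is a minimal plain-Python sketch. It deliberately does not use the tf.Transform API: the names `analyze`, `build_transform`, and the `x` feature are hypothetical, chosen only for this illustration. In tf.Transform itself the equivalent logic would live in a `preprocessing_fn`, using an analyzer-backed function such as `tft.scale_to_z_score`.

```python
import math
from typing import Callable, Dict, List

Row = Dict[str, float]


def analyze(rows: List[Row], key: str) -> Dict[str, float]:
    """Analysis phase: one full pass over the dataset to get mean and std."""
    values = [row[key] for row in rows]
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": mean, "std": math.sqrt(variance)}


def build_transform(stats: Dict[str, float], key: str) -> Callable[[Row], Row]:
    """Transform phase: freeze the statistics into a per-row function."""
    mean, std = stats["mean"], stats["std"]

    def transform(row: Row) -> Row:
        # Guard against a constant feature (std == 0).
        scaled = (row[key] - mean) / std if std > 0 else 0.0
        return {**row, key + "_scaled": scaled}

    return transform


# Hypothetical toy data, for illustration only.
training_rows = [{"x": 1.0}, {"x": 2.0}, {"x": 3.0}]

stats = analyze(training_rows, "x")      # computed once, on training data
transform = build_transform(stats, "x")  # the "exported" function

train_out = [transform(r) for r in training_rows]
serving_out = transform({"x": 4.0})      # same function, same statistics
```

The point of the split is that the serving-time row is scaled with the training-time statistics, which is what exporting the transform as a graph guarantees in tf.Transform.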

0 votes
0 answers
16 views

Using tft.scale_to_gaussian for preprocessing a dataset without using other tensorflow operations

I'm working on a project where I have a set of long-tailed data that I want to transform into a Gaussian distribution. I'm looking to achieve something similar to scikit-learn's PowerTransformer, but ...
umut • 1
0 votes
1 answer
37 views

Dataflow Tensorflow Transform write transformed data to BigQuery

In a GCP Dataflow pipeline, I am trying to write the transformed data from the Transform component to BigQuery, and I get the error below. First, I would appreciate it if someone could let me know if there ...
crbl • 395
1 vote
0 answers
35 views

Creating Tensors from features that are linked together

I have a set of multi-valued features which are linked together. As an example:
ItemCodes     Scores
AK, NA, UY    0.6, 0.2, 0.2
KG, AK        0.5, 0.5
Each item has a corresponding score associated with it. ...
Utkarsh D
0 votes
0 answers
25 views

TensorFlow Transform unexpected behavior while using tf.strings.unicode_split

I am trying to use TensorFlow Transform (1.13.0) and TensorFlow (2.12.1) as part of my pipeline and noticed that it doesn't return the correct answer. This is what I am running: with beam.Pipeline() ...
ramin • 31
0 votes
0 answers
29 views

universal sentence encoder batch pipeline failing

I have a batch job on the Dataflow runner that calculates the embedding from the input text. Within the pipeline, I am using tft.impl.context and impl.AnalyzeAndTransformDataset for this. Here ...
Akarsh Jain
2 votes
1 answer
834 views

tensorflow_transform installation failure on Mac M2

According to "Can't install due to dependency on numpy #289", TensorFlow Transform (tft) supports Python 3.9, and there is no limitation for macOS on Apple silicon stated in the TensorFlow Transform GitHub. ...
mon • 21.2k
0 votes
1 answer
676 views

Dealing with missing values in tensorflow

I need some guidance on the approach to imputation in tensorflow/deep learning. I am familiar with how scikit-learn handles imputation, and when I map it to the tensorflow ecosystem, I would expect ...
Pritam Dodeja
0 votes
1 answer
221 views

Transforming tensorflow datasets to beam datasets

There are a variety of ways to get a dataset you can train on in tensorflow. One of the things tensorflow transform does is provide the ability to do preprocessing via AnalyzeAndTransformDataset and ...
Pritam Dodeja
1 vote
1 answer
228 views

Add reserved tokens to `tft.vocabulary`

I would like to append words to the vocabulary created by tft.vocabulary that are not a part of the training samples (i.e. <mask> and <pad> tokens). I see in the docs that the tft....
Zach Robertson
1 vote
1 answer
531 views

apache beam rows to tfrecord in order to GenerateStatistics

I have built a pipeline that reads some data, does some manipulations, and creates some Apache Beam Row objects (Steps 1 and 2 in the code below). I would then like to generate statistics and write them ...
DarioB • 1,545
0 votes
1 answer
141 views

join datasets with tfx tensorflow transform

I am trying to replicate in TensorFlow Transform some data preprocessing that I have done in pandas. I have a few CSV files, which I joined and aggregated with pandas to produce a training dataset. ...
DarioB • 1,545
0 votes
1 answer
815 views

How to get vocabulary size in tensorflow_transform before apply_vocabulary?

Also posted the question at https://github.com/tensorflow/transform/issues/261. I am using tft in TFX and need to transform string list class labels into multi-hot indicators inside preprocessing_fn. ...
ynait • 1
1 vote
0 answers
190 views

How can I use BigQuery in a standalone tensorflow transform (TFT) pipeline?

I'm interested in interactive development of a preprocessing_fn for tft.AnalyzeAndTransformDataSet. By interactive development, I mean running a standalone beam pipeline in a Jupyter Notebook and ...
jb_ml_eng
1 vote
1 answer
214 views

TensorFlow Extended (TFX): Is there an easy way to debug functions from the Transform component?

I am supposed to modify a function which is part of the Transform component. It is a long series of TensorFlow operations and I am not sure a. how particular steps affect processed variables b. what does ...
Brzoskwinia
1 vote
1 answer
378 views

How do I pass a TensorFlow Dataset through a TensorFlow Transform pipeline?

I have implemented a custom TensorFlow Dataset for my raw data. I can download, prepare, and load the data as a tensorflow.data.Dataset as follows: import tensorflow_datasets builder = ...
Pierce Edmiston
