Reading BigQuery Table Data into Java Classes(Pojo) Directly #19412

Open
kennknowles opened this issue Jun 3, 2022 · 1 comment

Comments

@kennknowles
Member

While developing my code, I used the snippet below to read table data from BigQuery.

 


PCollection<ReasonCode> gpseEftReasonCodes = input
    .apply("Reading xxyyzz",
        BigQueryIO
            .read(new ReadTable<ReasonCode>(ReasonCode.class))
            .withoutValidation()
            .withTemplateCompatibility()
            .fromQuery("Select * from dataset.xxyyzz")
            .usingStandardSql()
            .withCoder(SerializableCoder.of(xxyyzz.class)));

ReadTable class:



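import com.google.gson.Gson;
import com.google.gson.JsonElement;
import java.util.HashMap;
import java.util.Map;
import org.apache.avro.Schema.Field;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.io.gcp.bigquery.SchemaAndRecord;
import org.apache.beam.sdk.metrics.Counter;
import org.apache.beam.sdk.metrics.Metrics;
import org.apache.beam.sdk.schemas.JavaBeanSchema;
import org.apache.beam.sdk.schemas.annotations.DefaultSchema;
import org.apache.beam.sdk.transforms.SerializableFunction;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

@DefaultSchema(JavaBeanSchema.class)
public class ReadTable<T> implements SerializableFunction<SchemaAndRecord, T> {
  private static final long serialVersionUID = 1L;
  private static final Gson gson = new Gson();
  public static final Logger LOG = LoggerFactory.getLogger(ReadTable.class);

  private final Counter countingRecords =
      Metrics.counter(ReadTable.class, "Reading Records EFT Report");
  private final Class<T> class1;

  public ReadTable(Class<T> class1) {
    this.class1 = class1;
  }

  @Override
  public T apply(SchemaAndRecord schemaAndRecord) {
    // Column name -> stringified value, converted to the POJO through Gson below.
    Map<String, String> mapping = new HashMap<>();
    try {
      GenericRecord s = schemaAndRecord.getRecord();
      org.apache.avro.Schema s1 = s.getSchema();
      for (Field f : s1.getFields()) {
        // Read each value by field name so the key and the value always match.
        Object value = s.get(f.name());
        mapping.put(f.name(), value == null ? null : String.valueOf(value));
      }
      countingRecords.inc();
      JsonElement jsonElement = gson.toJsonTree(mapping);
      return gson.fromJson(jsonElement, class1);
    } catch (Exception mp) {
      LOG.error("Found Wrong Mapping for the Record: " + mapping, mp);
      return null;
    }
  }
}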


After reading the data from BigQuery, I mapped each SchemaAndRecord to a POJO. For columns whose BigQuery data type is NUMERIC, I got values like the one below.


last_update_amount=java.nio.HeapByteBuffer[pos=0 lim=16 cap=16]

I expected to get the exact numeric value, but I got a HeapByteBuffer instead. I am using Apache Beam 2.12.0. If any more information is needed, please let me know.
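For reference, here is a minimal sketch of how such a buffer could be decoded by hand. It assumes the NUMERIC column arrives as Avro `bytes` with the `decimal` logical type, that is, a big-endian two's-complement unscaled integer, and that the scale is 9 as for BigQuery NUMERIC. The names `NumericDecoder`, `decode`, and `BIGQUERY_NUMERIC_SCALE` are hypothetical and are not part of Beam or Avro.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.nio.ByteBuffer;

// Hypothetical helper for illustration only; not a Beam or Avro API.
class NumericDecoder {
  // Assumption: BigQuery NUMERIC values are exported with a fixed scale of 9.
  static final int BIGQUERY_NUMERIC_SCALE = 9;

  // Interprets the remaining bytes as a big-endian two's-complement
  // unscaled integer and applies the given decimal scale.
  static BigDecimal decode(ByteBuffer buffer, int scale) {
    // Work on a duplicate so the caller's buffer position is left unchanged.
    ByteBuffer copy = buffer.duplicate();
    byte[] bytes = new byte[copy.remaining()];
    copy.get(bytes);
    return new BigDecimal(new BigInteger(bytes), scale);
  }
}
```

Under those assumptions, calling `NumericDecoder.decode((ByteBuffer) s.get(f.name()), NumericDecoder.BIGQUERY_NUMERIC_SCALE)` inside the mapping loop should give a BigDecimal instead of the buffer's `toString()` output.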

Second approach tried:


GenericRecord s = schemaAndRecord.getRecord();
org.apache.avro.Schema s1 = s.getSchema();
for (Field f : s1.getFields()) {
  Object value = s.get(f.name());
  mapping.put(f.name(), value == null ? null : String.valueOf(value));

  if (f.name().equalsIgnoreCase("reason_code_id")) {
    BigDecimal numericValue = new Conversions.DecimalConversion()
        .fromBytes((ByteBuffer) s.get(f.name()), Schema.create(s1.getType()), s1.getLogicalType());
    System.out.println("Numeric Con" + numericValue);
  } else {
    System.out.println("Else Condition" + f.name());
  }
}

Error encountered:


2019-05-24 (14:10:37) org.apache.avro.AvroRuntimeException: Can't create a: RECORD

 

It would be great to have a method that maps BigQuery data directly onto a POJO schema. For example, if the BigQuery table has 10 columns and my POJO needs only 5 of them, BigQueryIO should map only those 5 values into the Java class and ignore the rest, which is what I am doing now with considerable effort.
NUMERIC values should also be deserialized automatically when fetching data, as they are with TableRow.
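The requested behavior could look roughly like the sketch below. It is not an existing BigQueryIO feature. It assumes the row has already been turned into a column-name-to-value map, that POJO field names match the column names exactly, and that the field types are reference types compatible with the map values. `PojoMapper`, `fromColumns`, and `ReasonCodeRow` are hypothetical names used only for illustration.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.math.BigDecimal;
import java.util.Map;

// Hypothetical POJO that declares only the columns it cares about.
class ReasonCodeRow {
  String reason_code_id;
  BigDecimal last_update_amount;
}

// Hypothetical mapper; not part of Beam.
class PojoMapper {
  // Copies only the columns that have a same-named field on the POJO.
  // Columns without a matching field are ignored.
  static <T> T fromColumns(Map<String, ?> columns, Class<T> type) {
    try {
      Constructor<T> constructor = type.getDeclaredConstructor();
      constructor.setAccessible(true);
      T instance = constructor.newInstance();
      for (Field field : type.getDeclaredFields()) {
        if (Modifier.isStatic(field.getModifiers())
            || !columns.containsKey(field.getName())) {
          continue;
        }
        field.setAccessible(true);
        field.set(instance, columns.get(field.getName()));
      }
      return instance;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot map columns to " + type.getName(), e);
    }
  }
}
```

For example, `PojoMapper.fromColumns(row, ReasonCodeRow.class)` would fill `reason_code_id` and `last_update_amount` from a map that also contains other columns, leaving those other columns unused.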

 

Imported from Jira BEAM-7425. Original Jira may contain additional context.
Reported by: KishanK.

@Ganeshsivakumar

Hi @damccorm, is this issue still active?
