
Add tests for type conversions #15

Closed · morazow opened this issue Oct 2, 2018 · 14 comments

Comments

@morazow
Contributor

morazow commented Oct 2, 2018

Create a table in Exasol with all possible data types and assert that we can read different column types in Spark.

@jpizagno
Contributor

@morazow @3cham
I would like to contribute to this project. Would this be a good ticket for me to start with? (I think I understand the ticket, but have a question)
thanks
Jim

@morazow
Contributor Author

morazow commented Oct 25, 2018

Hello Jim,

Yes, feel free to take it.

The main idea here is to improve test coverage of type conversions. You can extend an existing Exasol table, or create a new one, with various column types (integer, double, tinyint, string, etc.) and make sure that the connector reads such tables with the correct Spark types.

You can also check how JDBC types are converted into Spark types here.

Please do not hesitate to ask me or @3cham.
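
A minimal sketch of what such an integration test could look like, assuming a running Exasol docker instance; the data source name, connection options, table, and exact type mappings below are illustrative, not the connector's confirmed API:

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types._
import org.scalatest.funsuite.AnyFunSuite

// Sketch only: TEST.ALL_TYPES is a hypothetical table created beforehand,
// e.g. with DECIMAL(9,0), DOUBLE PRECISION and VARCHAR(20) columns.
class TypesIntegrationSuite extends AnyFunSuite {

  test("connector reads Exasol columns as the expected Spark types") {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()

    val df = spark.read
      .format("exasol")               // illustrative data source name
      .option("host", "192.168.0.11") // illustrative docker host/port
      .option("port", "8563")
      .option("query", "SELECT * FROM TEST.ALL_TYPES")
      .load()

    val types = df.schema.fields.map(f => f.name -> f.dataType).toMap
    assert(types("A_DEC") === IntegerType) // assuming DECIMAL(9,0) maps to an integer
    assert(types("A_DBL") === DoubleType)
    assert(types("A_STR") === StringType)

    spark.stop()
  }
}
```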

@jpizagno
Contributor

@morazow
Great. My approach to implementing this test would be to write a TypesSuite class in the test package that tests the method Types.createSparkTypeFromSQLType(). It seems that ExasolRelation and createSparkStructType both lead to createSparkTypeFromSQLType(), so if createSparkTypeFromSQLType() is tested, then we have tested the type conversions. Correct?

thanks,
Jim

Yes, I can ask @3cham, since he is a few meters away from me.
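
A rough sketch of that unit-test approach; the signature of createSparkTypeFromSQLType used below (sqlType, precision, scale, isSigned) is an assumption and should be checked against Types.scala:

```scala
import java.sql.{Types => JdbcTypes}

import org.apache.spark.sql.types._
import org.scalatest.funsuite.AnyFunSuite

import com.exasol.spark.util.Types

class TypesSuite extends AnyFunSuite {

  test("converts JDBC type constants to Spark types") {
    // Assumed signature: (sqlType: Int, precision: Int, scale: Int, isSigned: Boolean)
    assert(Types.createSparkTypeFromSQLType(JdbcTypes.INTEGER, 10, 0, true) === IntegerType)
    assert(Types.createSparkTypeFromSQLType(JdbcTypes.DOUBLE, 0, 0, true) === DoubleType)
    assert(Types.createSparkTypeFromSQLType(JdbcTypes.VARCHAR, 0, 0, false) === StringType)
  }
}
```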

@morazow
Contributor Author

morazow commented Oct 26, 2018

Hello @jpizagno ,

Yes, that is correct.

However, you can also add a single integration test which reads a table with various types from Exasol (docker/exasol-db).

@jpizagno
Contributor

Hello @morazow
In short, is the Exasol maximum precision limit 36 or 38?

I am testing the Exasol type conversion. The class DecimalType.scala lists "val MAX_PRECISION = 38", while the Exasol PDF manual (page 95, section 2.3.2, Numeric data types) lists "precision <= 36".
When I run a test with DECIMAL(38,38), I get a stack trace:

java.sql.SQLException: illegal precision value: 38 [line 8, column 31] (Session: 1615596311647547145)
[info] at com.exasol.jdbc.ExceptionFactory.createSQLException(ExceptionFactory.java:175)
[info] at com.exasol.jdbc.EXASQLException.getSQLExceptionIntern(EXASQLException.java:50)
[info] at com.exasol.jdbc.AbstractEXAStatement.execute(AbstractEXAStatement.java:468)
[info] at com.exasol.jdbc.EXAStatement.execute(EXAStatement.java:278)
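
For reference, the failure can be reproduced outside the connector with plain JDBC; a sketch with illustrative connection details (sys/exasol are the docker defaults):

```scala
import java.sql.DriverManager

object DecimalPrecisionRepro extends App {
  // Illustrative host/port for a local exasol-db docker container.
  val connection =
    DriverManager.getConnection("jdbc:exa:192.168.0.11:8563", "sys", "exasol")
  val statement = connection.createStatement()

  statement.execute("CREATE SCHEMA TEST") // assuming the schema does not exist yet
  // Throws java.sql.SQLException: illegal precision value: 38
  statement.execute("CREATE TABLE TEST.DEC_T (C1 DECIMAL(38,38))")
  // DECIMAL(36,36) succeeds, matching the manual's precision <= 36 limit.
}
```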

@morazow
Contributor Author

morazow commented Oct 29, 2018

Good morning @jpizagno,

That is a good catch!

In my initial implementation I did not check these types in depth. Could you please also update the max_precision to 36?

@jpizagno
Contributor

Hey @morazow and @3cham
I checked again, and the MAX_PRECISION=38 comes from DecimalType in spark-catalyst_2.11-2.3.1. So it seems that Spark has MAX=38 whereas the Exasol JDBC driver has MAX=36.
The problem is that this number has nothing to do with spark-exasol-connector.

I would not advise putting a hard-coded 36 in spark-exasol-connector, but I do not see another option. Where is MAX=36 accessible from the Exasol JDBC driver?

I can see that Exasol JDBC 6.0.8 has precision=36 in BigDecimalColumn.class, but there is no way to access that number.

If MAX=36 is not accessible from the Exasol-JDBC, then should we hard-code 36 in spark-exasol-connector/Types.scala?
( ;< ugly solution)

If there is no way to access this number/limit from the JDBC driver, should we open a ticket with Exasol to make these constants accessible through a static class?

@jpizagno
Contributor

@morazow and @3cham
Here is an example from my fork/branch where the MAX_PRECISION and MAX_SCALE are hard-coded in Types.scala. It works, but is dangerous. What are your thoughts?

https://github.com/jpizagno/spark-exasol-connector/blob/d2de8a928342f1022f1d5a5080fdddc00b9b5426/src/main/scala/com/exasol/spark/util/Types.scala#L12
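
The approach amounts to something like the following (a reconstruction for discussion, not the exact contents of the linked branch):

```scala
// Hard-coded because Exasol JDBC 6.0.8 does not expose these limits
// programmatically; values taken from the Exasol manual (section 2.3.2).
object ExasolLimits {
  val MAX_PRECISION_EXASOL: Int = 36
  val MAX_SCALE_EXASOL: Int = 36
}
```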

@3cham
Contributor

3cham commented Oct 29, 2018

IMO, the precision and scale are returned from the ResultSet metadata, so the value can never exceed 36 (see: here).

I think there is no need to change the MAX_PRECISION & MAX_SCALE in the boundedDecimal method; those come from the Spark DataType and are fine already. However, in your implementation getMaxPrecision() and getMaxScale() should return the values from Exasol.
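
In code, that distinction could look like this (a sketch; boundedDecimal and the getter names follow the discussion above, and their bodies are assumptions):

```scala
import org.apache.spark.sql.types.{DataType, DecimalType}

object DecimalBounds {
  // Spark-side bound: clamp whatever the ResultSet metadata reports to
  // Spark's own DecimalType limits (MAX_PRECISION = MAX_SCALE = 38).
  def boundedDecimal(precision: Int, scale: Int): DataType =
    DecimalType(
      math.min(precision, DecimalType.MAX_PRECISION),
      math.min(scale, DecimalType.MAX_SCALE)
    )

  // Exasol-side limits: properties of the database, not of Spark.
  def getMaxPrecisionExasol(): Int = 36
  def getMaxScaleExasol(): Int = 36
}
```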

@jpizagno
Contributor

@3cham
Good point. I guess nobody can create a type in Exasol with PRECISION > 36, so it may not be a relevant test. I will keep the getter methods (renaming them to conform to getter naming conventions), adjust the tests, and carry on.

@morazow
Let me know your thoughts.

@morazow
Contributor Author

morazow commented Oct 30, 2018

Hey @jpizagno, @3cham

The changes look good so far from my side.

We can bound the precision and scale by 36, the Exasol limit, since they will never be larger than this value.

I think this is okay for reading / querying from Exasol into Spark. However, it is good to know that Spark's max value is 38. This might become relevant when we save or insert into Exasol from a Spark dataframe.
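
A small illustration of that write-side concern, using standard Spark cast semantics (everything here is illustrative, not connector code):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.DecimalType

val spark = SparkSession.builder().master("local[*]").getOrCreate()
import spark.implicits._

// A column at Spark's maximum precision of 38 ...
val df = Seq(BigDecimal("1.23")).toDF("amount")
  .select(col("amount").cast(DecimalType(38, 2)).as("amount"))

// ... would need narrowing to Exasol's maximum of 36 before an insert;
// values that overflow the narrower type become null under Spark's cast rules.
val writable = df.select(col("amount").cast(DecimalType(36, 2)).as("amount"))
```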

@jpizagno
Contributor

@morazow
Great. I just added a test that expects an exception to be thrown when an Exasol type cannot be converted to a Spark type.
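
For instance (a sketch; the exact exception type thrown by Types.scala is an assumption, as is the method signature):

```scala
import org.scalatest.funsuite.AnyFunSuite

import com.exasol.spark.util.Types

class UnsupportedTypesSuite extends AnyFunSuite {

  test("throws when a JDBC type has no Spark counterpart") {
    // java.sql.Types.ROWID is one JDBC type a connector may choose not to
    // support; the expected exception type here is assumed.
    intercept[IllegalArgumentException] {
      Types.createSparkTypeFromSQLType(java.sql.Types.ROWID, 0, 0, false)
    }
  }
}
```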

I will wait for the pull request on sub-connections to go through, rebase my fork, and then submit a pull request for review.

@morazow
Contributor Author

morazow commented Oct 30, 2018

@jpizagno great!

I have merged the other PR.

@morazow
Contributor Author

morazow commented Nov 5, 2018

Solved at #26!

Test coverage is up to 87% 🎉

morazow closed this as completed Nov 5, 2018