Releases: alibaba/Alink
Alink version 1.5.1
- Improve the performance of the deep learning (dl) module.
- Resolve many issues on the Windows platform.
- Add an incremental training mode for LR, Softmax, etc.
- Improve the performance of graph-based random walk algorithms.
Alink version 1.5.0
Alink version 1.4.0
- Adapt to Flink 1.13.
- Fix some bugs.
- Add some feature engineering methods.
- Refine the documentation of BatchOp/StreamOp.
- Add Java demos.
Alink version 1.3.2
Alink version 1.3.1
- Adapt to Flink 1.12.
- Add the Kafka plugin.
- Add the S3 file system.
- Add the ODPS catalog.
- Fix Poisson and add GLM model info.
- Support multiple files in the pipeline loader and local predictor loader.
- Use the legacy serializer for compatibility with the old Ak format.
- Change the vector type to CompositeType and the sparse vector to a POJO type.
- Remove REGEXP_REPLACE from the SQL selector for Flink 1.12.
Alink version 1.3.0
- Add more model info batch ops and support printing model info in pipeline models.
- Add a recommendation module.
- Supported recommenders are:
- ALS
- Factorization Machines
- ItemCF
- UserCF
- Other supported functions for the recommendation module are:
- Leave k-object out
- Leave top k-object out
- Ranking evaluation
- Multi-Label evaluation
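The leave-k-object-out split listed above can be sketched in a few lines: for each user, hold out k interactions for evaluation and train on the rest. This is a generic illustration of the idea, assuming a dict of user-to-items input; it is not Alink's implementation or API.

```python
import random

def leave_k_out(user_items, k, seed=0):
    """Hold out k items per user for testing; keep the rest for training.
    A generic sketch of leave-k-object-out, not Alink's implementation."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, items in user_items.items():
        items = list(items)
        rng.shuffle(items)          # randomize which items are held out
        test[user] = items[:k]      # k held-out items per user
        train[user] = items[k:]     # remaining items for training
    return train, test
```

Ranking evaluation then compares each recommender's top-k list against the held-out items.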
- Add online learning algorithms.
- FTRL model filter
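The FTRL family of online learners referenced above follows the FTRL-Proximal update of McMahan et al. The sketch below shows one per-example step for logistic loss; it is a minimal illustration of the algorithm, not Alink's FtrlTrainStreamOp, and all names here are illustrative.

```python
import math

def ftrl_update(z, n, w, x, y, alpha=0.1, beta=1.0, l1=1.0, l2=1.0):
    """One FTRL-Proximal step for logistic loss.
    z, n: per-coordinate accumulators; w: lazily computed weights;
    x: sparse example as {feature: value}; y: label in {0, 1}."""
    # lazily recompute weights for the active features (L1 gives sparsity)
    for i in x:
        if abs(z[i]) <= l1:
            w[i] = 0.0
        else:
            sign = 1.0 if z[i] > 0 else -1.0
            w[i] = -(z[i] - sign * l1) / ((beta + math.sqrt(n[i])) / alpha + l2)
    # predict with the current weights
    p = 1.0 / (1.0 + math.exp(-sum(w[i] * v for i, v in x.items())))
    # update accumulators with the gradient of the logistic loss
    for i, v in x.items():
        g = (p - y) * v
        sigma = (math.sqrt(n[i] + g * g) - math.sqrt(n[i])) / alpha
        z[i] += g - sigma * w[i]
        n[i] += g * g
    return p
```

In a streaming setting, this step runs once per arriving example, which is what makes incremental/online training possible.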
- Add a series of similarity algorithms.
- VectorNearestNeighbor
- TextSimilarity
- TextNearestNeighbor
- TextApproxNearestNeighbor
- StringSimilarity
- StringNearestNeighbor
- StringApproxNearestNeighbor
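String similarity and nearest-neighbor operators like those above are typically built on edit-distance-style metrics. As a concrete example, here is the classic dynamic-programming Levenshtein distance; this is a textbook sketch of one such metric, not the specific distance Alink uses.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b, computed row by row
    so only two rows of the DP table are kept in memory."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                    # deletion
                           cur[j - 1] + 1,                 # insertion
                           prev[j - 1] + (ca != cb)))      # substitution
        prev = cur
    return prev[-1]
```

A nearest-neighbor op then ranks candidate strings by this distance; the "Approx" variants trade exactness for speed on large candidate sets.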
- Add DocWordCountBatchOp, KeywordsExtractionBatchOp, TfidfBatchOp and WordCountBatchOp
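The TF-IDF weighting behind ops like TfidfBatchOp scores a word by its frequency in a document, discounted by how many documents contain it. The sketch below uses a common smoothed IDF variant; the exact formula Alink uses may differ, so treat this as an illustration of the idea only.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Compute TF-IDF per (document, word) over tokenized docs.
    docs: list of token lists. Returns one {word: score} dict per doc."""
    n = len(docs)
    df = Counter()                       # document frequency per word
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        scores.append({w: (c / total) * math.log((1 + n) / (1 + df[w]))
                       for w, c in tf.items()})
    return scores
```

Words appearing in every document get a score of zero under this smoothing, while rarer words are weighted up.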
- Add KNN
- Add GeoKMeans and streaming KMeans
- Add model selection algorithms.
- RandomSearchCV
- RandomSearchTVSplit
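Random search, the strategy behind RandomSearchCV and RandomSearchTVSplit, samples parameter combinations at random and keeps the best-scoring one. The sketch below shows the core loop under the assumption of a user-supplied train/evaluate callback; it is not Alink's API.

```python
import random

def random_search(train_eval, param_space, n_trials=20, seed=0):
    """Sample n_trials random parameter combinations and return the best.
    train_eval: callable taking a params dict and returning a score
                (higher is better);
    param_space: dict mapping each param name to a list of candidate values."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = {k: rng.choice(v) for k, v in param_space.items()}
        score = train_eval(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score
```

The CV variant scores each sample by cross-validation; the TVSplit variant scores it on a single train/validation split, which is cheaper.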
- Add plugin support for file systems and catalogs. Add catalogs for Hive, MySQL, Derby and SQLite.
- PyAlink:
- Align with new functionalities on the Java side, including new operators, catalogs, the plugin mechanism, and so on.
- For Flink version 1.9, PyAlink now depends on PyFlink directly, which enables flink run and table-related operations.
- Fix some issues, optimize performance and add more parameters in linear and tree models
- Add test utils module and optimize performance of unit tests.
- Remove the db module.
- Refine the save/load in pipeline and pipeline model. Use Ak as the default format for save/load.
- Support loading a LocalPredictor from an Ak file saved on a filesystem. This avoids a collect when loading the LocalPredictor. See #78 #79
- Add multi-threading in all mappers
- Optimize memory usage of batch prediction.
- Add pseudoInverse for matrices
- Support sparse vectors without a specified size
- Fix a sequencing issue when calling linkFrom on the model info batch op
- Optimize the format of lazy print.
- Add Stopwatch and TimeSpan
- Add serialVersionUID in all serializable classes.
Alink version 1.2.0
- Adapt to Flink 1.11
- Add Factorization Machines classification and regression #115
- Support Lazy APIs for higher user interactivity and richer information.
Lazy APIs enable intermediate outputs of the ML pipeline to be printed, collected, and post-processed along with the main stream of data processing. Such intermediate outputs include: ML models and training information, evaluation metrics, data statistics, etc.
- PyAlink supported
- Support Lazy APIs for BatchOperators and related methods in EstimatorBase/TransformerBase #116
- Add model information:
- Linear model #118 #132
- Tree model #125
- PCA #117
- ChisqSelector #117
- VectorChisqSelector #117
- KMeans #120
- BisectingKMeans #120
- NaiveBayes #122
- Lda #122
- GaussianMixture #120
- OneHotEncoder #120
- QuantileDiscretizer #120
- MinMaxScaler #122
- VectorMinMaxScaler #122
- MaxAbsScaler #122
- VectorMaxAbsScaler #122
- StandardScaler #122
- VectorStandardScaler #122
- Add training information:
- word2vec #125
- Add statistics:
- Add EvaluationMetrics #124
- Add FileSystem APIs. #126
Using the FileSystem APIs, users can process files on different file systems with a unified and friendly experience. Such operations include exists, isDir, list, read, write, and other common file functions. Supported file systems are:
- HDFS
- OSS
- Local
- Add Ak source/sink and Csv source/sink supporting the new FileSystem APIs. #126
Ak is a file format that stores data together with its schema and can be written to a filesystem. It takes advantage of a compressed, tabular data representation. The supported APIs are shown in the table below:

| | HDFS | OSS | Local |
| --- | --- | --- | --- |
| Ak source | ✔️ | ✔️ | ✔️ |
| Ak sink | ✔️ | ✔️ | ✔️ |
| Csv source | ✔️ | ✔️ | ✔️ |
| Csv sink | ✔️ | ✔️ | ✔️ |

- Support EqualWidthDiscretizer. #123
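Equal-width discretization splits a numeric column's range into a fixed number of same-sized intervals and maps each value to its bin index. A minimal sketch of that idea follows; it is not the EqualWidthDiscretizer API, and the bin boundaries Alink fits may be handled differently (e.g. for out-of-range values at predict time).

```python
def equal_width_bins(values, num_bins):
    """Map each value to a bin index in [0, num_bins) over equal-width
    intervals spanning [min(values), max(values)]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins or 1.0   # avoid /0 when all values equal
    # clamp the maximum value into the last bin
    return [min(int((v - lo) / width), num_bins - 1) for v in values]
```

Contrast with QuantileDiscretizer, which instead chooses boundaries so each bin holds roughly the same number of rows.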
- Feature enhancements and API unification in clustering. #121
- Refine code of QuantileDiscretizer and OneHotEncoder #111
- Fix predict stream op in alspredictstreamop.md #104
Alink version 1.1.2
- Add transformers among formats Vector, CSV, Json, KV, Columns and Triple #93
• Support AnyToAny transformation
• Unified transformation params for ease of use.
- Support SQL select statements in the Pipeline and LocalPredictor #61
• Support flink planner built-in functions regarding individual rows: comparison, logical, arithmetic, string, temporal, conditional, type conversion, hash, etc.
• Add alink_shaded/shaded_protobuf_java to support usage of native Calcite.
- Support Hive source and sink #96
• Support Batch/Stream source&sink of Hive.
• Support partition of table.
• Simplify the dependence of Hive jar.
• Support multiple Hive versions: 2.0, 2.1, 2.2, 2.3, 3.0
- Fix PyAlink startup and UDF issues on Windows #76, #77
- Support BigInteger type in MySql source #86
- Add open and close in mapper. #92
- Add open function in SegmentMapper and StopwordsRemoverMapper #94
- Unify HandleInvalid Params #95
Alink version 1.1.1
Enhancements & New Features
- Optimize conversion between operators and DataFrames
- Auto-detect localIp when using useRemoteEnv
- Add enum type parameter #65
• Adapt enum type params in quantile, distance and decision tree. #67
• Change linear model training params to enum #71
• Add enum parameters to Kafka, StringIndexer and Join #72
• Adapt enum type params in PCA, chi-square test, GLM and correlation. #73
- Support window group-by in stream operators #68
- Add operators to parse strings in CSV, JSON and KV formats to columns #70
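The string-to-columns parsing mentioned above is simplest to see for the KV format: split on a column delimiter, then on a key-value delimiter. The sketch below shows the idea with hypothetical delimiters and function names; Alink's actual operators handle type casting, schemas, and invalid-value policies on top of this.

```python
def kv_to_columns(s, col_delim=",", kv_delim="="):
    """Parse a 'k1=v1,k2=v2' style string into a {column: value} dict.
    A sketch of the KV-to-columns idea, not the Alink operator's API."""
    out = {}
    for pair in s.split(col_delim):
        if not pair:
            continue                     # skip empty segments
        k, _, v = pair.partition(kv_delim)
        out[k] = v                       # values stay strings here;
    return out                           # real ops cast to column types
```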
- Tokenizer supports string split with multiple spaces #69
- Make error message clear when selected columns are not found #66
- Add an FTRL example #64
Fix & Refinements
Alink version 1.1.0
Enhancements & New Features
- Improve UDF/UDTF operators; Java and PyAlink now have consistent usage and behavior. #32 #44
- Publish to maven central and PyPI.
- Support Flink 1.10 and Flink 1.9. #46
- Support more Kafka connectors. #41.
API change
- Modify the Naive Bayes algorithm to be a text classifier. #47
- Modify and enhance the parameters and models of QuantileDiscretizer, OneHotEncoder and Bucketizer. #48
Documentation
Fix & Refinements
- Fix the problem in the LDA online method and refine comments in FeatureLabelUtil. #29
- Fix the bug that the initial data of KMeansAssignCluster is not cleared. #31
- Fix the int overflow bug when reading large CSV files, and add test cases for CsvFileInputSplit. See #27
- Clean up some code. #15
- Remove a redundant test case whose data source is inaccessible. See #28
- Fix the NPE in PCA. See #42
PyPI support
- Support PyAlink installation via pip install pyalink
Maven Dependencies
Alink is now synchronized to the Maven central repository, so you can easily add it to Maven projects.
With Flink-1.10
<dependency>
<groupId>com.alibaba.alink</groupId>
<artifactId>alink_core_flink-1.10_2.11</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>1.10.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner_2.11</artifactId>
<version>1.10.0</version>
</dependency>
With Flink-1.9
<dependency>
<groupId>com.alibaba.alink</groupId>
<artifactId>alink_core_flink-1.9_2.11</artifactId>
<version>1.1.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-streaming-scala_2.11</artifactId>
<version>1.9.0</version>
</dependency>
<dependency>
<groupId>org.apache.flink</groupId>
<artifactId>flink-table-planner_2.11</artifactId>
<version>1.9.0</version>
</dependency>