
SQLFlow


What is SQLFlow

SQLFlow is a compiler that compiles a SQL program to a workflow that runs on Kubernetes. The input is a SQL program written in our extended SQL grammar to support AI jobs, including training, prediction, model evaluation, model explanation, custom jobs, and mathematical programming. The output is an Argo workflow that runs on a Kubernetes cluster in a distributed fashion.

SQLFlow supports various database systems like MySQL, MariaDB, TiDB, Hive, and MaxCompute, and many machine learning toolkits like TensorFlow, Keras, and XGBoost.

Try SQLFlow NOW in our playground https://playground.sqlflow.tech/ and check out the handy tutorials in it.

Motivation

Developing ML-based applications today requires a team of data engineers, data scientists, and business analysts, as well as a proliferation of advanced languages and programming tools like Python, SQL, SAS, SASS, Julia, and R. This fragmentation of tooling and development environments adds engineering difficulty to model training and tuning. What if we married the most widely used data management/processing language, SQL, with ML/system capabilities and let engineers with SQL skills develop advanced ML-based applications?

There is already some work in progress in the industry. We can write simple machine learning prediction (or scoring) algorithms in SQL using operators like DOT_PRODUCT. However, this requires copy-and-pasting model parameters from the training program into SQL statements. In the commercial world, we see some proprietary SQL engines providing extensions to support machine learning capabilities.

  • Microsoft SQL Server: Microsoft SQL Server has the machine learning service that runs machine learning programs in R or Python as an external script.
  • Teradata SQL for DL: Teradata also provides a RESTful service, which is callable from the extended SQL SELECT syntax.
  • Google BigQuery: Google BigQuery enables machine learning in SQL by introducing the CREATE MODEL statement.

None of the existing solutions solves our pain points. Instead, we want a fully extensible solution:

  1. The solution should be compatible with many SQL engines, instead of a specific version or type.
  2. It should support sophisticated machine learning models, including TensorFlow for deep learning and XGBoost for trees.
  3. We also want the flexibility to configure and run cutting-edge ML algorithms, including specifying feature crosses; at the very least, no Python or R code should be embedded in the SQL statements, and the solution should be fully integrated with hyperparameter estimation.

Quick Overview

Here are examples of training a TensorFlow DNNClassifier model on the sample data iris.train and running prediction with the trained model. You can see how cool it is to write elegant ML code in SQL:

sqlflow> SELECT *
FROM iris.train
TO TRAIN DNNClassifier
WITH model.n_classes = 3, model.hidden_units = [10, 20]
COLUMN sepal_length, sepal_width, petal_length, petal_width
LABEL class
INTO sqlflow_models.my_dnn_model;

...
Training set accuracy: 0.96721
Done training
sqlflow> SELECT *
FROM iris.test
TO PREDICT iris.predict.class
USING sqlflow_models.my_dnn_model;

...
Done predicting. Predict table : iris.predict

How to use SQLFlow

Contributing Guidelines

Roadmap

SQLFlow would love to support as many mainstream ML frameworks and data sources as possible, but we feel this expansion would be hard to do entirely on our own, so we would love to hear your opinions on which ML frameworks and data sources you currently use and build upon. Please refer to our roadmap for specific timelines, and also let us know your current scenarios and interests around the SQLFlow project so we can prioritize based on feedback from the community.

Feedback

Your feedback is our motivation to move on. Please let us know your questions, concerns, and issues by filing GitHub Issues.

License

Apache License 2.0

Comments
  • [Proposal] Design proposal for adding graph data to SQLFlow database


    Is your feature request related to a problem? Please describe. Hi, I'm thinking about bringing a new type of training data (graph data) to the SQLFlow database. This is significant since much real-world data is non-Euclidean, such as graphs, and people who use SQLFlow may encounter such data in their tasks. Deep learning models such as GCN and GAT are powerful for solving graph-related problems, and it would be helpful to include them in the library in the future. However, before we bring these models to SQLFlow, it would be convenient to have a pre-loaded graph dataset such as cora in the SQLFlow database so that these models can be trained and tested easily.

    This is a rough idea, and the following are some of my thoughts on possible solutions. It would be good if we could discuss it a bit, and any suggestions are appreciated!

    Describe the solution you'd like

    Part I. Database schema

    If we want to use graph-related DL models to solve real-world problems, two things need to be provided: features, the information contained within each node of the graph, and the adjacency matrix, which represents the graph structure in matrix form (it can be computed from an edge list). features should be a 2-D tensor with shape (N, D), where N is the number of nodes and D is the dimension of each node's feature vector. The adjacency matrix is a 2-D sparse tensor with shape (N, N).

    Thus, I'm considering having two tables in the database, which would be enough to maintain all the information we need for a graph.

    • Node Table: store the information of each node in one table.
    Node Table
    
    id  | name | features | label
    -----------------------------------------
    105 | node1 | "0 0 1" or [0, 0, 1] | "L1"
    106 | node2 | "0 1 0" or [0, 1, 0] | "L2"
    

    The features may be in the form of arrays or vectors, and I guess storing them with type TEXT or JSON would be efficient.
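    For illustration only, here is a minimal Python sketch (not SQLFlow code, and the helper names are hypothetical) of how a features value stored as TEXT like "0 0 1" or as JSON like [0, 0, 1] could be decoded into a numeric vector:

    import json
    import numpy as np

    # Hypothetical helpers that only illustrate decoding the two storage formats above.
    def decode_text_features(value):   # TEXT column, e.g. "0 0 1"
        return np.array(value.split(), dtype=np.float32)

    def decode_json_features(value):   # JSON (or TEXT) column, e.g. "[0, 0, 1]"
        return np.asarray(json.loads(value), dtype=np.float32)

    print(decode_text_features("0 0 1"))      # -> [0. 0. 1.]
    print(decode_json_features("[0, 1, 0]"))  # -> [0. 1. 0.]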

    • Edge Table: store the graph structures in the form of edges in one table.
    Edge Table
    
    id | from_node_id | to_node_id | weight
    ---------------------------------------
    1  | 105          | 106        | 1.0
    2  | 106          | 105        | 2.5
    

    From my perspective, these two tables are efficient and flexible enough to handle most graph data. If you find corner cases that make this design fragile, please comment below.
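    As a quick, self-contained sanity check of the proposed schema, here is a sketch using the standard-library sqlite3 module instead of the real SQLFlow database; the table and column names simply mirror the tables above:

    import sqlite3

    # Create the two proposed tables in an in-memory database and read the edge list back.
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
    CREATE TABLE node (id INTEGER PRIMARY KEY, name TEXT, features TEXT, label TEXT);
    CREATE TABLE edge (id INTEGER PRIMARY KEY, from_node_id INTEGER, to_node_id INTEGER, weight REAL);
    INSERT INTO node VALUES (105, 'node1', '0 0 1', 'L1'), (106, 'node2', '0 1 0', 'L2');
    INSERT INTO edge VALUES (1, 105, 106, 1.0), (2, 106, 105, 2.5);
    """)
    edges = conn.execute("SELECT from_node_id, to_node_id FROM edge").fetchall()
    print(edges)  # [(105, 106), (106, 105)]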

    Part II. Loading data

    (I'm not familiar with how SQLFlow passes data into Python, so I skip the process of getting data from the two tables above.) The difficult part of loading data is building the adjacency matrix of the graph. Here are two solutions that I find to be good:

    • Scipy: Use the Scipy package to build the adjacency matrix. If we have an edge array (or list of lists) edges with shape (E, 2), where E is the number of edges, we could build the adjacency matrix using the following Python script:
    import numpy as np
    import scipy.sparse as sp
    # coo_matrix((data, (i, j)), [shape=(M, N)])
    edges = np.asarray(edges)  # make sure edges supports 2-D indexing
    adjacency = sp.coo_matrix((np.ones(len(edges)),
                               (edges[:, 0], edges[:, 1])),
                              shape=(features.shape[0], features.shape[0]),
                              dtype="int64")
    

    features.shape[0] is the number of nodes (N) and the adjacency matrix adjacency has shape (N, N). The adjacency matrix is a Scipy sparse matrix in the format COO (COO is a fast format for constructing sparse matrices).

    • NetworkX: Use the NetworkX package to generate a graph and then build the adjacency matrix automatically. This can be done with the following Python script:
    import networkx as nx
    # use the cora dataset as an example; here we create an undirected graph. Use nx.DiGraph() for a directed graph.
    g = nx.Graph(edges)  # edges must be a list of lists; each sublist (A, B) represents the edge between nodes A and B.
    adjacency = nx.adjacency_matrix(g)
    

    adjacency is a Scipy sparse matrix; we could convert it to the COO format with adjacency.tocoo(), which yields the same adjacency matrix as above. NetworkX is a brilliant Python library for graph and network analysis, and it has many other useful tools we may use in future development. We could decide which method to apply for loading the data from the database.

    Additional Notes Here are some details about the dataset that I would love to add to the database:

    • cora dataset: The Cora dataset is about a citation network of scientific papers. It consists of 2708 scientific publications classified into one of seven classes. The citation network consists of 5429 links. Each publication in the dataset is described by a 0/1-valued word vector indicating the absence/presence of the corresponding word from the dictionary. The dictionary consists of 1433 unique words.

    If there is anything that you find valuable or that needs to be improved, please let me know. Thanks!

  • Rethink syntax extension


    During our efforts to deploy SQLFlow in some real cases, we tried to extend the syntax of some SQL dialects, including MySQL and Hive. In these efforts, we learned an art:

    Try reusing existing SQL reserved words in the extended syntax.

    Why this art? Let us look at an example in https://github.com/sql-machine-learning/sqlflow/issues/473. Because TRAIN isn't a reserved word, users could name a field TRAIN, and this might confuse our parser. The solution is to either replace TRAIN with a reserved word, but we couldn't find one that expresses the meaning of "train", or to extend TRAIN into TO TRAIN, where TO is a reserved word.

    It seems that other clause extensions after TO TRAIN, TO PREDICT, and TO EXPLAIN can use arbitrary words, as they are not going to be parsed by a SQL (dialect) parser but by the SQLFlow parser. However, it is not that simple. Consider that a table might have a field named label, and this field happens to be the label when we train a model. The SQL statement would look like

    SELECT a, b, label FROM tbl
    TO TRAIN Model
    LABEL label
    

    This would confuse the parser. Our current workaround is this. However, a complete solution seems to be replacing LABEL with OUTPUT. OUTPUT is a SQL keyword, so users are not supposed to name a field OUTPUT. (This might not always be true, but at least the probability of encountering OUTPUT output seems smaller than that of LABEL label.)

    Please vote for the following syntax changes:

  • Start to jupyter iris-dnn examples, then encounter this problem


    Description When I open a web browser, open the iris-dnn.ipynb file, and run the code %%sqlflow describe iris.train;

    But I encounter this problem:

    /miniconda/envs/sqlflow-dev/lib/python3.6/site-packages/grpc/_channel.py in _next(self)
        363             raise StopIteration()
        364         else:
    --> 365             raise self
        366
        367     def _response_ready():

    _Rendezvous: <_Rendezvous of RPC that terminated with:
        status = StatusCode.UNAVAILABLE
        details = "failed to connect to all addresses"
        debug_error_string = "{"created":"@1572500632.931119638","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3876,"referenced_errors":[{"created":"@1572500571.009073496","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":395,"grpc_status":14}]}"
    >

    Who can help me? Thanks reaaaally much!

  • Switch the connection parameters while sqlflow is running


    Is your feature request related to a problem? Please describe. In the production environment, we can't specify the data source connection parameters when SQLFlow starts; instead, we only know the connection parameters after the user logs on. Therefore, we should support changing the connection parameters after startup is complete.

  • Proposal: Managing external & internal parsers


    Problem

    SQLFlow calls the Java HiveQL/ODPS parsers to help parse the SQL program. As shown in the following dependency graph, each parser is wrapped as a gRPC server that takes a string and returns the parsed result.


    However, some parsers, like the ODPS parser, can't be open-sourced. This leads to a circular dependency between the internal code and the open-source code, i.e., the open-source Java gRPC server needs to call the ODPS parser, while the closed-source ODPS parser needs to return a ParseResult.


    Solution

    We remove the dependency of the gRPC server on the ODPS parser via dynamic loading. To be specific, the open-source code creates an ODPS parser instance as follows.

    Object parser = Class.forName("org.sqlflow.parser.internal.odpsParser....").newInstance();  
    ((ParserInterface)parser).parse(sql);  
    

    By doing so, we only have a one-way dependency from internal repo to the GitHub repo.


    Implementation Details

    GitHub repo:

    1. Each parser should be an implementation of ParserInterface.
    interface ParserInterface {
      ParseResult parse(String sql);
    }
    

    Internal repo:

    1. The building process starts from building the GitHub repo into a .jar file, then adding it to the Maven project.
    2. The deployment of the internal .jar file should be the same as the open-sourced .jar file, since they share the same entry point.

    cc @typhoonzero @weiguoz

  • SQLFlow Product Roadmap 2019


    I am trying to summarize the milestones as follows. @sql-machine-learning/sqlflow team please comment.

    Engineering time: 6 fulltime months could mean 3 people spend 2 months full time working on the project.

    | Release | ETA   | Features                                                | Engineering Time    |
    |---------|-------|---------------------------------------------------------|---------------------|
    | Alpha   | 04/20 | ~~Open source SQLFlow Core~~                            | 10+ fulltime months |
    |         | 05/20 | ~~Support customized model with Keras~~                 | 1+ fulltime months  |
    |         | 05/20 | ~~Open source Go Hive driver~~                          | 3+ fulltime months  |
    |         | 06/20 | ~~Open source Go MaxCompute driver~~                    | 3+ fulltime months  |
    | 0.1.0   | 07/01 | Hive: local training & prediction                       | 2+ fulltime months  |
    |         | 07/20 | Support distributed training & prediction               | 3+ fulltime months  |
    | 0.2.0   | 08/01 | MaxCompute: local training & predicting                 | 3+ fulltime months  |
    | 0.3.0   | 10/15 | Elastic scheduling training & prediction                | 3+ fulltime months  |
    | 0.4.0   | 11/20 | SQLFlow cloud release on AliCloud                       | 2+ fulltime months  |
    |         |       |                                                         |                     |
    |         | ?     | Support third party submitter                           |                     |
    |         | ?     | Support GPU or TensorFlow-GPU                           |                     |
    |         | ?     | ML on image/audio/video                                 |                     |
    |         | ?     | Calcite parser                                          |                     |
    |         | ?     | Open source gosparksql                                  |                     |
    |         | ?     | SparkSQL as data source: local training & predicting    |                     |
    |         | ?     | SQL Server as data source: local training & predicting  |                     |

  • Make SQLFlow parser work together with Apache Calcite


    Currently, we use sql/sql.y, which compiles into sql/parser.go, to parse a SQL statement. This parser has some limitations:

    1. It understands our extended SQL syntax for the SELECT statement, for example,

      SELECT * FROM a_table TRAIN DNNClassifier INTO a_model;
      

      but cannot understand any other SQL statements, for example, USE my_database;

    2. It understands only the very basic syntax of the standard SELECT; for example, it doesn't allow us to write a nested SELECT with the TRAIN and INTO suffix.

    To make a parser that understands SQL completely and the extended SELECT statement, I am considering a hybrid of Calcite and our parser. The following example might help to explain how the hybrid parser works.

    1. Throw the above example statement SELECT ... TRAIN ..., or even a more complex one like SELECT ... FROM (SELECT ...) TRAIN ..., to the Calcite parser. The Calcite parser should report an error because it doesn't understand TRAIN. If the error message contains the location of TRAIN in the input string, then we can split the input string at that location: the left part is the SELECT or nested SELECT, and the right part consists of TRAIN ... (see the sketch after this list).

    2. Throw the left part to Calcite parser again. This time the parser should be OK with it.

    3. Throw the right part to our parser (modified to ignore the SELECT ... prefix); it should be able to parse the TRAIN clause and provide information for the SQLFlow code generator to create a submitter.py, which takes the left part that passed Calcite's syntax check. The submitter then sends the left part to the SQL engine (MySQL or Hive) and reads the output for training or prediction.
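    The following is only a rough Python sketch of the splitting idea above, not actual SQLFlow or Calcite code; calcite_parse and extended_parse are hypothetical stand-ins, and we assume the Calcite error reports the character offset of the first token it cannot understand:

    # Hypothetical parser callbacks: calcite_parse raises CalciteError(position)
    # on the first unrecognized token; extended_parse handles the TRAIN ... clause.
    class CalciteError(Exception):
        def __init__(self, position):
            self.position = position  # offset of the unrecognized token, e.g. TRAIN

    def hybrid_parse(sql, calcite_parse, extended_parse):
        try:
            calcite_parse(sql)            # a plain SQL statement: Calcite alone suffices
            return sql, None
        except CalciteError as err:
            left = sql[:err.position]     # the (possibly nested) SELECT part
            right = sql[err.position:]    # the TRAIN ... INTO ... extension
            calcite_parse(left)           # the left part must now pass Calcite's check
            return left, extended_parse(right)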

  • Determine the ordering of records for a SELECT statement in MySQL


  • Potential issue when running tests concurrently


    Currently, main_test.go runs end-to-end tests by first populating the data into the database. This normally works well for a local DB inside a Docker container. However, for tests related to ODPS/MaxCompute, we are using a real database, with SQL in iris_sql.go that looks like the following:

    DROP TABLE IF EXISTS gomaxcompute_driver_w7u.sqlflow_test_iris_train;
    CREATE TABLE gomaxcompute_driver_w7u.sqlflow_test_iris_train
    ...
    

    Since the table name is fixed, if multiple test runs execute concurrently, there's a chance that the table is being dropped or created at the same time, which might fail other test runs that rely on this particular table. I've seen this happen several times when developing locally or testing on Travis CI.

  • Data sharding problem in distributed XGBoost training using rabit


    I am not sure how XGBoost distributed training using rabit shards the input data. Supposing that the whole dataset is D, there are two common methods to shard input data in distributed training:

    • Method 1: we shard dataset D beforehand into N pieces, where N is the number of workers. Then, we distribute each piece of dataset to each worker. In this way, each worker reads unique data.
    • Method 2: all workers can read the whole dataset D, and each worker only picks up 1/N of the data inside the whole dataset D; this is, say, what tf.data.Dataset.shard does.

    In the docs of XGBoost and rabit, I have not found whether method 1 or 2 is used. But from the implementation of XGBoost, it seems that XGBoost may use method 2.
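    To make Method 2 concrete, here is a tiny, standalone tf.data example (illustration only, not SQLFlow or XGBoost code); the worker count and index are made-up values:

    import tensorflow as tf

    num_workers, worker_index = 2, 0        # hypothetical values for illustration
    dataset = tf.data.Dataset.range(10)     # stands in for the whole dataset D
    shard = dataset.shard(num_shards=num_workers, index=worker_index)
    print(list(shard.as_numpy_iterator()))  # worker 0 keeps every 2nd record: [0, 2, 4, 6, 8]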

    Link to my issue of XGBoost repo: https://github.com/dmlc/xgboost/issues/5694

  • File '/tmp/sqlflow227199809/input.sql' cannot be read


    Description It seems that SQLFlow cannot read a temporary file saved on the server.

    Reproduction Steps After restarting the notebook kernel, the following exception is thrown:

    _Rendezvous: <_Rendezvous of RPC that terminated with:
    	status = StatusCode.UNKNOWN
    	details = "thirdPartyParse failed: java.io.IOException: File '/tmp/sqlflow227199809/input.sql' cannot be read
    	at org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:294)
    	at org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851)
    	at org.sqlflow.parser.ParserAdaptorCmd.main(ParserAdaptorCmd.java:42)
     exit status 255"
    	debug_error_string = "{"created":"@1574340457.693353044","description":"Error received from peer ipv4:10.82.128.7:8005","file":"src/core/lib/surface/call.cc","file_line":1055,"grpc_message":"thirdPartyParse failed: java.io.IOException: File '/tmp/sqlflow227199809/input.sql' cannot be read\n\tat org.apache.commons.io.FileUtils.openInputStream(FileUtils.java:294)\n\tat org.apache.commons.io.FileUtils.readFileToByteArray(FileUtils.java:1851)\n\tat org.sqlflow.parser.ParserAdaptorCmd.main(ParserAdaptorCmd.java:42)\n exit status 255","grpc_status":2}"
    >
    


  • CVE-2007-4559 Patch


    Patching CVE-2007-4559

    Hi, we are security researchers from the Advanced Research Center at Trellix. We have begun a campaign to patch a widespread bug named CVE-2007-4559. CVE-2007-4559 is a 15-year-old bug in the Python tarfile package. By using extract() or extractall() on a tarfile object without sanitizing input, a maliciously crafted .tar file could perform a directory path traversal attack. We found at least one unsanitized extractall() in your codebase and are providing a patch for you via pull request. The patch essentially checks to see if all tarfile members will be extracted safely and throws an exception otherwise. We encourage you to use this patch or your own solution to secure against CVE-2007-4559. Further technical information about the vulnerability can be found in this blog.
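    For reference, here is a minimal sketch of the kind of check the patch describes (not the exact pull request): verify that every member's resolved path stays inside the destination directory before extracting.

    import os
    import tarfile

    def is_within_directory(directory, target):
        # True only if `target` resolves to a path inside `directory`.
        abs_directory = os.path.abspath(directory)
        abs_target = os.path.abspath(target)
        return os.path.commonprefix([abs_directory, abs_target]) == abs_directory

    def safe_extractall(tar, path="."):
        # Reject members whose names would escape `path` (directory traversal, CVE-2007-4559).
        for member in tar.getmembers():
            member_path = os.path.join(path, member.name)
            if not is_within_directory(path, member_path):
                raise Exception("Attempted path traversal in tar file")
        tar.extractall(path)

    # Usage sketch:
    # with tarfile.open("archive.tar") as tar:
    #     safe_extractall(tar, "dest/")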

    If you have further questions, you may contact us through this project's lead researcher, Kasimir Schulz.

  • Add SECURITY.md


    Hello 👋

    I run a security community that finds and fixes vulnerabilities in OSS. A researcher (@whokilleddb) has found a potential issue, which I would be eager to share with you.

    Could you add a SECURITY.md file with an e-mail address for me to send further details to? GitHub recommends a security policy to ensure issues are responsibly disclosed, and it would help direct researchers in the future.

    Looking forward to hearing from you 👍

    (cc @huntr-helper)

  • The link provided by the documentation cannot be recognized


    Document URL: https://github.com/sql-machine-learning/sqlflow/blob/develop/doc/run/workflow_mode.md Step: Setup Kubernetes and Argo

    kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo/stable/manifests/install.yaml
    

    This URL cannot be recognized and needs to be updated to

    kubectl apply -n argo -f https://raw.githubusercontent.com/argoproj/argo-workflows/v3.0.3/manifests/install.yaml
    
  • run activepower clustering demo in tutorial, receive 'AttributeError: 'EmbeddingColumn' object has no attribute 'key''


    Description Running the activepower clustering demo in the tutorial, I receive 'AttributeError: 'EmbeddingColumn' object has no attribute 'key''.

    demo in tutorial activepower_clustering

    I set up a MySQL database locally with the SQL script, so I have the same data as the demo.

    Then I run: %%sqlflow SELECT * FROM sql_flow_test.activepower_train TO TRAIN sqlflow_models.DeepEmbeddingClusterModel WITH model.n_clusters=3, model.pretrain_epochs=10, model.train_max_iters=800, model.train_lr=0.01, model.pretrain_lr=1, train.batch_size=256 COLUMN m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12,m13,m14,m15,m16,m17,m18,m19,m20,m21,m22,m23,m24,m25,m26,m27,m28,m29,m30,m31,m32,m33,m34,m35,m36,m37,m38,m39,m40,m41,m42,m43,m44,m45,m46,m47,m48 INTO sqlflow_models.my_activepower_train_model;



    2021/05/20 10:02:13 SQLFlow Step Execute:

    SELECT * FROM sql_flow_test.activepower_train
    TO TRAIN sqlflow_models.DeepEmbeddingClusterModel
    WITH
      model.n_clusters=3,
      model.pretrain_epochs=10,
      model.train_max_iters=800,
      model.train_lr=0.01,
      model.pretrain_lr=1,
      train.batch_size=256
    COLUMN m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12,m13,m14,m15,m16,m17,m18,m19,m20,m21,m22,m23,m24,m25,m26,m27,m28,m29,m30,m31,m32,m33,m34,m35,m36,m37,m38,m39,m40,m41,m42,m43,m44,m45,m46,m47,m48
    INTO sqlflow_models.my_activepower_train_model;

    Start training using keras model...
    2021-05-20 10:02:38.341380 Start pre_train.
    2021-05-20 10:02:38.341461 Start preparing training dataset to save into memory.

    message:<message:"runSQLProgram error: failed: exit status 1
    ==========Generated Code:==========
    # -- coding: utf-8 --
    import copy
    import traceback
    import tensorflow as tf
    import runtime
    from runtime.tensorflow.train import train
    from runtime.tensorflow.get_tf_version import tf_is_version2
    from tensorflow.estimator import (DNNClassifier, DNNRegressor, LinearClassifier,
                                      LinearRegressor, BoostedTreesClassifier, BoostedTreesRegressor,
                                      DNNLinearCombinedClassifier, DNNLinearCombinedRegressor)
    ...
    try:
        import sqlflow_models
    except Exception as e:
        print("failed to import sqlflow_models: %s", e)
        traceback.print_exc()

    feature_column_names = ["dates", "m1", "m2", ..., "m48", "class"]

    feature_column_names_map = dict()
    feature_column_names_map["feature_columns"] = ["dates", "m1", "m2", ..., "m48", "class"]

    feature_metas = dict()
    feature_metas["dates"] = {
        "feature_name": "dates",
        "dtype": "string",
        "delimiter": "",
        "format": "",
        "shape": [1],
        "is_sparse": "false" == "true",
        "dtype_weight": "int64",
        "delimiter_kv": ""
    }
    ...  # feature_metas for "m1" through "m48" (dtype "float32") and "class" (dtype "int64") follow the same pattern

    label_meta = {
        "feature_name": "",
        "dtype": "int64",
        "delimiter": "",
        "shape": [1],
        "is_sparse": "false" == "true"
    }

    model_params = dict()
    model_params["n_clusters"] = 3
    model_params["pretrain_epochs"] = 10
    model_params["pretrain_lr"] = 1
    model_params["train_lr"] = 0.010000
    model_params["train_max_iters"] = 800
    ...
    feature_columns_code = """{"feature_columns": [tf.feature_column.embedding_column(
        tf.feature_column.categorical_column_with_vocabulary_list(key="dates", vocabulary_list=[...]),
        dimension=128, combiner="sum"),
    tf.feature_column.numeric_column("m1", shape=[1], dtype=tf.dtypes.float32),
    ...
    tf.feature_column.numeric_column("m48", shape=[1], dtype=tf.dtypes.float32),
    tf.feature_column.numeric_column("class", shape=[1], dtype=tf.dtypes.int64)]}"""
    feature_columns = eval(feature_columns_code)

    train_max_steps = 0
    train_max_steps = None if train_max_steps == 0 else train_max_steps

    train(datasource="mysql://yuepf:yuepf123456@tcp(192.195.253.130:3306)/?maxAllowedPacket=0",
          estimator_string="""sqlflow_models.DeepEmbeddingClusterModel""",
          select="""SELECT * FROM sql_flow_test.activepower_train""",
          validation_select="""""",
          feature_columns=feature_columns,
          feature_column_names=feature_column_names,
          feature_metas=feature_metas,
          label_meta=label_meta,
          model_params=model_params_constructed,
          validation_metrics="Accuracy".split(","),
          save="model_save",
          batch_size=256,
          epoch=1,
          ...
          feature_columns_code=feature_columns_code,
          model_params_code_map=model_params,
          model_repo_image="",
          original_sql='''SELECT * FROM sql_flow_test.activepower_train TO TRAIN ... INTO sqlflow_models.my_activepower_train_model;''',
          feature_column_names_map=feature_column_names_map)

    ==========Output==========
    Traceback (most recent call last):
      File "", line 823, in <module>
      File "/opt/sqlflow/python/runtime/tensorflow/train.py", line 116, in train
      File "/opt/sqlflow/python/runtime/tensorflow/train_keras.py", line 144, in keras_train_and_save_legacy
      File "/opt/sqlflow/python/runtime/tensorflow/train_keras.py", line 155, in keras_train_compiled
        classifier.sqlflow_train_loop(train_dataset)
      File "/usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py", line 246, in sqlflow_train_loop
        self.pre_train(x)
      File "/usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py", line 153, in pre_train
        y = x.cache().map(map_func=_concate_generate)
      ...
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
        raise e.ag_error_metadata.to_exception(e)
    AttributeError: in converted code:

        /usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py:150 _concate_generate *
            concate_y = tf.stack([dataset_element[feature.key] for feature in self._feature_columns], axis=1)

        AttributeError: 'EmbeddingColumn' object has no attribute 'key'
    " >

    workflow step failed: runSQLProgram error: failed: exit status 1

    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m32"] = {

    "feature_name": "m32",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m33"] = {

    "feature_name": "m33",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m34"] = {

    "feature_name": "m34",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m35"] = {

    "feature_name": "m35",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m36"] = {

    "feature_name": "m36",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m37"] = {

    "feature_name": "m37",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m38"] = {

    "feature_name": "m38",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m39"] = {

    "feature_name": "m39",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m40"] = {

    "feature_name": "m40",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m41"] = {

    "feature_name": "m41",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m42"] = {

    "feature_name": "m42",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m43"] = {

    "feature_name": "m43",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m44"] = {

    "feature_name": "m44",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m45"] = {

    "feature_name": "m45",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m46"] = {

    "feature_name": "m46",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m47"] = {

    "feature_name": "m47",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["m48"] = {

    "feature_name": "m48",
    
    "dtype": "float32",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    feature_metas["class"] = {

    "feature_name": "class",
    
    "dtype": "int64",
    
    "delimiter": "",
    
    "format": "",
    
    "shape": [1],
    
    "is_sparse": "false" == "true",
    
    "dtype_weight": "int64",
    
    "delimiter_kv": ""
    

    }

    label_meta = {
    "feature_name": "",
    "dtype": "int64",
    "delimiter": "",
    "shape": [1],
    "is_sparse": "false" == "true"
    }

    model_params=dict()

    model_params["n_clusters"]=3

    model_params["pretrain_epochs"]=10

    model_params["pretrain_lr"]=1

    model_params["train_lr"]=0.010000

    model_params["train_max_iters"]=800

    Construct optimizer objects to pass to model initializer.

    The original model_params is serializable (do not have tf.xxx objects).

    model_params_constructed = copy.deepcopy(model_params)
    for optimizer_arg in ["optimizer", "dnn_optimizer", "linear_optimizer"]:
        if optimizer_arg in model_params_constructed:
            model_params_constructed[optimizer_arg] = eval(model_params_constructed[optimizer_arg])

    if "loss" in model_params_constructed:
        model_params_constructed["loss"] = eval(model_params_constructed["loss"])

    # feature_columns_code will be used to save the training information
    # together with the saved model.

    feature_columns_code = """{"feature_columns": [tf.feature_column.embedding_column(tf.feature_column.categorical_column_with_vocabulary_list(key="dates", vocabulary_list=["2/26","3/5","3/17","3/23","4/10","5/5","1/13","1/20","2/24","4/24","6/23","1/6","1/9","6/3","2/10","3/16","2/18","4/11","4/22","5/3","6/22","1/10","1/30","6/4","6/11","6/17","6/26","1/28","5/20","5/21","2/11","3/1","3/8","4/26","5/6","5/9","1/21","2/4","5/17","4/21","4/25","5/29","6/27","2/23","3/3","4/18","4/23","5/12","2/2","3/14","3/22","4/2","4/20","5/8","5/23","5/25","1/26","1/27","6/1","4/13","1/22","3/13","4/6","4/19","2/7","3/27","4/28","6/2","6/8","1/4","1/25","3/9","3/12","3/21","4/5","6/6","2/15","3/6","2/13","3/29","2/8","2/12","2/28","3/2","3/31","4/16","5/14","1/8","1/17","5/7","5/24","2/21","2/25","3/18","4/3","6/7","6/10","6/29","2/16","3/7","6/14","6/16","6/24","1/2","3/20","4/12","5/31","2/5","4/1","5/22","6/9","6/15","6/19","6/25","6/30","1/16","3/24","3/30","6/28","1/15","1/23","1/31","2/17","3/25","4/7","1/7","1/14","5/30","6/12","3/11","3/28","5/10","1/1","1/3","2/27","3/19","4/14","4/30","5/28","6/18","1/12","1/18","6/20","4/9","5/4","5/11","5/13","5/16","1/29","2/1","2/9","2/20","5/18","5/19","1/5","1/24","5/26","6/21","3/15","5/15","4/8","4/15","5/27","1/11","4/4","2/19","3/4","3/26","4/27","4/29","6/13","1/19","2/6","3/10","4/17","5/2","2/14","2/22","6/5","2/3","5/1"]), dimension=128, combiner="sum"),

    tf.feature_column.numeric_column("m1", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m2", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m3", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m4", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m5", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m6", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m7", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m8", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m9", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m10", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m11", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m12", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m13", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m14", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m15", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m16", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m17", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m18", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m19", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m20", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m21", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m22", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m23", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m24", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m25", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m26", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m27", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m28", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m29", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m30", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m31", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m32", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m33", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m34", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m35", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m36", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m37", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m38", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m39", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m40", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m41", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m42", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m43", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m44", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m45", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m46", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m47", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("m48", shape=[1], dtype=tf.dtypes.float32),

    tf.feature_column.numeric_column("class", shape=[1], dtype=tf.dtypes.int64)]}"""

    feature_columns = eval(feature_columns_code)

    train_max_steps = 0

    train_max_steps = None if train_max_steps == 0 else train_max_steps

    train(datasource="mysql://yuepf:yuepf123456@tcp(192.195.253.130:3306)/?maxAllowedPacket=0",

      estimator_string="""sqlflow_models.DeepEmbeddingClusterModel""",
    
      select="""
    

    SELECT * FROM sql_flow_test.activepower_train

    """,

      validation_select="""""",
    
      feature_columns=feature_columns,
    
      feature_column_names=feature_column_names,
    
      feature_metas=feature_metas,
    
      label_meta=label_meta,
    
      model_params=model_params_constructed,
    
      validation_metrics="Accuracy".split(","),
    
      save="model_save",
    
      batch_size=256,
    
      epoch=1,
    
      validation_steps=1,
    
      verbose=0,
    
      max_steps=train_max_steps,
    
      validation_start_delay_secs=0,
    
      validation_throttle_secs=0,
    
      save_checkpoints_steps=100,
    
      log_every_n_iter=10,
    
      load_pretrained_model="false" == "true",
    
      is_pai="false" == "true",
    
      pai_table="",
    
      pai_val_table="",
    
      feature_columns_code=feature_columns_code,
    
      model_params_code_map=model_params,
    
      model_repo_image="",
    
      original_sql='''
    

    SELECT * FROM sql_flow_test.activepower_train

    TO TRAIN sqlflow_models.DeepEmbeddingClusterModel

    WITH

    model.n_clusters=3,

    model.pretrain_epochs=10,

    model.train_max_iters=800,

    model.train_lr=0.01,

    model.pretrain_lr=1,

    train.batch_size=256

    COLUMN m1,m2,m3,m4,m5,m6,m7,m8,m9,m10,m11,m12,m13,m14,m15,m16,m17,m18,m19,m20,m21,m22,m23,m24,m25,m26,m27,m28,m29,m30,m31,m32,m33,m34,m35,m36,m37,m38,m39,m40,m41,m42,m43,m44,m45,m46,m47,m48

    INTO sqlflow_models.my_activepower_train_model;

    ''',

      feature_column_names_map=feature_column_names_map)
    

    ==========Output==========

    Traceback (most recent call last):
      File "", line 823, in
      File "/opt/sqlflow/python/runtime/tensorflow/train.py", line 116, in train
        load_pretrained_model, model_meta, is_pai)
      File "/opt/sqlflow/python/runtime/tensorflow/train_keras.py", line 144, in keras_train_and_save_legacy
        validation_steps, has_none_optimizer)
      File "/opt/sqlflow/python/runtime/tensorflow/train_keras.py", line 155, in keras_train_compiled
        classifier.sqlflow_train_loop(train_dataset)
      File "/usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py", line 246, in sqlflow_train_loop
        self.pre_train(x)
      File "/usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py", line 153, in pre_train
        y = x.cache().map(map_func=_concate_generate)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 1900, in map
        MapDataset(self, map_func, preserve_cardinality=False))
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 3416, in __init__
        use_legacy_function=use_legacy_function)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2695, in __init__
        self._function = wrapper_fn._get_concrete_function_internal()
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1854, in _get_concrete_function_internal
        *args, **kwargs)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 1848, in _get_concrete_function_internal_garbage_collected
        graph_function, _, _ = self._maybe_define_function(args, kwargs)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2150, in _maybe_define_function
        graph_function = self._create_graph_function(args, kwargs)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/eager/function.py", line 2041, in _create_graph_function
        capture_by_value=self._capture_by_value),
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/framework/func_graph.py", line 915, in func_graph_from_py_func
        func_outputs = python_func(*func_args, **func_kwargs)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2689, in wrapper_fn
        ret = _wrapper_helper(*args)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/data/ops/dataset_ops.py", line 2634, in _wrapper_helper
        ret = autograph.tf_convert(func, ag_ctx)(*nested_args)
      File "/usr/lib/python3.6/site-packages/tensorflow_core/python/autograph/impl/api.py", line 237, in wrapper
        raise e.ag_error_metadata.to_exception(e)
    AttributeError: in converted code:

        /usr/lib/python3.6/site-packages/sqlflow_models/deep_embedding_cluster.py:150 _concate_generate  *
            concate_y = tf.stack([dataset_element[feature.key] for feature in self._feature_columns], axis=1)

        AttributeError: 'EmbeddingColumn' object has no attribute 'key'
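
The failure comes from `_concate_generate` in `sqlflow_models/deep_embedding_cluster.py`, which assumes every entry of `self._feature_columns` exposes a `.key` attribute. Columns built with `tf.feature_column.numeric_column` do, but the `EmbeddingColumn` generated for the `dates` field only carries the key on its wrapped categorical column, so the stack fails on the first embedding column. Below is a minimal sketch of one way to resolve the source column name for either kind of feature column; it only illustrates the mismatch and is not the project's official fix (a simpler workaround may be to keep only the numeric m1..m48 columns in the feature list for this clustering model):

    # Sketch: look up the source column name for a feature column, whether it
    # is a NumericColumn (which has .key) or an EmbeddingColumn (which only
    # exposes the key via its wrapped categorical column).
    def column_key(fc):
        if hasattr(fc, "key"):
            return fc.key
        if hasattr(fc, "categorical_column"):
            return fc.categorical_column.key
        return fc.name

    # _concate_generate could then stack by the resolved key:
    #     concate_y = tf.stack(
    #         [dataset_element[column_key(fc)] for fc in self._feature_columns],
    #         axis=1)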
    

Environment (Please complete the following information):

- OS: Windows 10
- Browser: Chrome
- Docker Desktop
- Docker Engine: v20.10.6
- Kubernetes: v1.19.7

SQLFlow integration with Flink

Will SQLFlow integrate with Flink in the future, so that data can be pre-processed with Flink and TensorFlow deep learning can be driven from SQL? If the community has such a plan, our team would like to participate and contribute code. Our team is deeply involved in the Flink community and is interested in AI-related integration.
