As I continue my journey learning Flink SQL, one thing I wondered was how one would go about submitting a Flink SQL job in a production scenario. It seems to me that the SQL Gateway's REST API would be a good candidate for this: you'd put the SQL code in a file under source control, and then use a deployment pipeline to submit that SQL against the endpoint. It's worth noting that application mode is the recommended way to deploy production jobs to Flink, and the SQL Client and SQL Gateway don't support it yet. There is a FLIP under discussion, but in the meantime if you want to use application mode you'd need to wrap your SQL in a JAR to deploy it. Either that, or continue to use the SQL Client or Gateway and be aware of the limitations of running the job in session mode (basically, no resource isolation).
In this article I'll show you how to use the endpoint, including exploring it with the Postman tool, using HTTPie to call the endpoint from the shell, and finishing off with a viable proof-of-concept script that executes SQL statements from a file.
The API is documented and has two versions, each with its own OpenAPI YAML spec. I'm going to look at <span class="inline-code">v2</span> here. Note that the docs say that the spec is still experimental.
To get started, let's bring up the Flink cluster and the SQL Gateway locally:
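# A minimal sketch, assuming you're in the home directory of a local Flink
# distribution (these scripts ship with Flink itself):
./bin/start-cluster.sh
./bin/sql-gateway.sh start -Dsql-gateway.endpoint.rest.address=localhost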
Postman is a handy tool for doing tons of useful stuff with APIs. Here I'm just using it to create the sample calls against the endpoints and to quickly understand how they relate to each other. You can use it on the web, but if you're using a local instance of Flink (or at least, one that is not publicly accessible) you'll want the desktop version. Note that you'll still need to sign up for a free account to access the Import feature that we're using here.
Under your Workspace click Import and paste in the URL of the SQL Gateway's OpenAPI YAML file (<span class="inline-code">https://nightlies.apache.org/flink/flink-docs-master/generated/rest_v2_sql_gateway.yml</span>) which is linked to on the docs page. Import it as a Postman Collection.
You'll now see under the Postman collection a list of all the API endpoints, with pre-created calls for each. Go to Environments -> Globals and define <span class="inline-code">baseUrl</span> with the value for your SQL Gateway. If you're running it locally then this is going to be <span class="inline-code">http://localhost:8083</span>.
Now go back to Collections and under the Flink SQL Gateway REST API folder find the <span class="inline-code">get Info</span> call. Open this up and hit Send. You should see a successful response like this:
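{
  "productName": "Apache Flink",
  "version": "1.18.1"
}
(The version shown will reflect whichever Flink release you're running.)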
You can also click the Code icon (<span class="inline-code"></></span>) to see the call in various languages and tools, including cURL and HTTPie. For now this is not ground-breaking, but once you get onto payloads it's really handy.
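For example, the generated HTTPie call for the <span class="inline-code">get Info</span> request looks something like this:
http http://localhost:8083/v2/info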
Just as we manually populated the global variable <span class="inline-code">baseUrl</span> above, we can also take the response from one call and use it when making another. This is really useful, because there are two values that the REST API returns that we need to use (<span class="inline-code">sessionHandle</span> and <span class="inline-code">operationHandle</span>). To do this in Postman, add the following to the Tests tab of the request pane:
// Parse the JSON response and store the session handle as an
// environment variable for use in subsequent requests
var jsonData = JSON.parse(responseBody);
postman.setEnvironmentVariable("sessionHandle", jsonData.sessionHandle);
That assumes that the variable to populate is called <span class="inline-code">sessionHandle</span> and that it's returned in a root-level key of the response, also called <span class="inline-code">sessionHandle</span>. Which it is:
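{
  "sessionHandle": "0ddba5e3-73c5-439f-9a5c-94b0d4b82a4d"
}
(The handle here is just an illustrative UUID; yours will differ.)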
Once you've set a variable, you can use it in other calls by referencing it in double curly braces, like this:
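{{baseUrl}}/sessions/{{sessionHandle}}/operations/{{operationHandle}}/status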
I've shared a copy of my Postman collection, with the above variable configuration done for you, here.
Let's now go through the workflow of how we'd actually submit a SQL statement from scratch to the gateway.
Running a SQL Statement with the Flink SQL Gateway
In essence, the minimal steps are as follows. You can see the docs for more info.
Establish a Session (with optional configuration parameters set).
Submit a SQL Statement, which generates an Operation.
Check the status of the Operation until it's complete.
Fetch the results of the Operation.
Here's how to do each one, using HTTPie as an example client and showing the response. I'm using bash variables to hold the values of session and operation handles.
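What follows is a minimal sketch of those four steps, assuming the gateway is listening on <span class="inline-code">localhost:8083</span> and that <span class="inline-code">jq</span> is available to pull values out of the JSON responses. The <span class="inline-code">SET;</span> statement is just an illustrative example.
# 1. Establish a session, optionally passing configuration properties
SESSION_HANDLE=$(http POST localhost:8083/v2/sessions \
  properties:='{"execution.runtime-mode": "batch"}' | jq -r '.sessionHandle')

# 2. Submit a SQL statement, which generates an operation
OPERATION_HANDLE=$(http POST "localhost:8083/v2/sessions/${SESSION_HANDLE}/statements" \
  statement="SET;" | jq -r '.operationHandle')

# 3. Check the status of the operation (repeat until it reports FINISHED)
http GET "localhost:8083/v2/sessions/${SESSION_HANDLE}/operations/${OPERATION_HANDLE}/status"

# 4. Fetch the results (token 0 is the first page; the rowFormat parameter
#    may be required depending on your Flink version)
http GET "localhost:8083/v2/sessions/${SESSION_HANDLE}/operations/${OPERATION_HANDLE}/result/0?rowFormat=JSON"

# Illustrative, abbreviated response from step 4:
# {
#   "results": {
#     "columns": [ ... ],
#     "data": [ { "kind": "INSERT", "fields": [ "execution.runtime-mode", "batch" ] }, ... ]
#   },
#   "resultType": "PAYLOAD",
#   "nextResultUri": "/v2/sessions/.../operations/.../result/1?rowFormat=JSON"
# }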
Note here that the <span class="inline-code">runtime-mode</span> has been set from the <span class="inline-code">properties</span> that were passed above in the session creation.
Because <span class="inline-code">resultType</span> is not <span class="inline-code">EOS</span> and there's a value for <span class="inline-code">nextResultUri</span>, we know there's more to fetch, at the location specified in <span class="inline-code">nextResultUri</span>.
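Putting that together, a simple proof-of-concept loop (again a sketch, under the same assumptions as above) keeps fetching pages until the gateway stops returning a <span class="inline-code">nextResultUri</span>:
# Start at the first page of results and follow nextResultUri until it's
# absent, which happens once the gateway returns a resultType of EOS
URI="/v2/sessions/${SESSION_HANDLE}/operations/${OPERATION_HANDLE}/result/0?rowFormat=JSON"
while [ -n "$URI" ]; do
  PAGE=$(http GET "localhost:8083${URI}")
  echo "$PAGE" | jq '.results.data'
  URI=$(echo "$PAGE" | jq -r '.nextResultUri // empty')
done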
You might also be interested to try Decodable, which provides a fully managed Apache Flink and Debezium service. Our CLI and API both support deployment of pipelines with Flink SQL.
Robin Moffatt
Robin is a Principal DevEx Engineer at Decodable. He has been speaking at conferences since 2009 including QCon, Devoxx, Strata, Kafka Summit, and Øredev. You can find many of his talks online and his articles on the Decodable blog as well as his own blog.
Outside of work, Robin enjoys running, drinking good beer, and eating fried breakfasts—although generally not at the same time.