What's different about Python in Exaptive?


Exaptive is a bit of a paradigm shift from typical Python programming. The advantage, once you get through it, is more modular and reusable Python code. This article covers the concepts and some practicalities for making the shift. You'll want to have the Python API reference handy when you start to code.

Think in the Dataflow:
In Exaptive, everything moves through a dataflow. Data passes through wires and arrives at components, which then run code. This code can be JS, R, or Python. Each component is made up of a single script, and each input calls the method of the same name inside that script. Once an input has been received and your Python is running, things should feel familiar. However, interfacing with the inputs and outputs can be a bit tricky (more on this later).

Data Model:
As a rule of thumb, try to adhere to the Exaptive data model any time you're sending an output. This ensures that the data you send through an output port will play nicely with the interfaces of other components. Later in this article, I'll cover typical approaches to casting your data into the Exaptive data model. See the Exaptive data model documentation for details that are not specific to Python.

Infrastructure Considerations:
There are also some hardware limitations imposed on the environment your script runs in. For instance, you only have access to a single core, so libraries for running your code in parallel (multiprocessing, threading) are out. Also, because of the nature of the infrastructure, basic file I/O is not a good option for persistence.

Runtime:
We also impose a timeout that doesn't exist outside of Exaptive. At the time of this writing, the timeout is 5 minutes. If a query or algorithm runs long, your best bet is to break it into chunks and perform one chunk at a time.
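
One way to read the chunking advice is sketched below: split the work into fixed-size batches so that each piece finishes quickly on its own. The names `chunked` and `process_batch` and the batch size are illustrative only and are not part of the Exaptive API. How each batch is fed back through your dataflow depends on your design, so the loop here only demonstrates the splitting.

```python
def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


def process_batch(batch):
    """Hypothetical stand-in for the expensive per-batch work."""
    return [value * 2 for value in batch]


# Handle one small batch at a time instead of the whole list at once.
results = []
for batch in chunked(list(range(10)), 4):
    results.extend(process_batch(batch))
```

Keeping the batch size as a parameter makes it easy to tune once you know how long a single batch takes.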

Printing/debugging:

`print` will print to the component log, which is not returned to the user. Instead, use self.api.log, which sends values to the Exaptive log (Tools -> log). The one major caveat is that in a normal system, log lines appear while the script is running. In Exaptive they do not: the script must finish executing, and then all of the logs are returned at once, in order.
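
Because the logs only show up after the method returns, it can help to label each message with a step name and the elapsed time. The sketch below assumes self.api.log accepts a single value, which is an inference from the description above rather than a documented signature. `_StubApi` and `_StubComponent` are hypothetical stand-ins that exist only so the snippet runs outside Exaptive.

```python
import time


class _StubApi(object):
    """Hypothetical stand-in for self.api; it only records logged values."""

    def __init__(self):
        self.logged = []

    def log(self, value):
        self.logged.append(value)


class _StubComponent(object):
    """Hypothetical stand-in for the object passed to a component method."""

    def __init__(self):
        self.api = _StubApi()


def my_data(self):
    started = time.time()
    self.api.log("my_data: started")
    total = sum(range(1000))  # placeholder for the real work
    elapsed = time.time() - started
    self.api.log("my_data: finished in %.3fs, total=%d" % (elapsed, total))


component = _StubComponent()
my_data(component)
```

Inside Exaptive you would keep only the `my_data` function; the stubs are for local experimentation.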


What's the same in and out of Exaptive?

Once you're inside the component, very little is different. You can import modules and define functions, classes, and methods. From function to function you can use whatever data structures you'd like (pandas, numpy, scipy matrices).

Python Version

As of this writing, the only supported Python version in Exaptive is 2.7. Check the version release notes in your Studio or contact support to see if that has changed.



Installing Dependencies


There are three types of dependencies available: apt, pip, and file.


To install dependencies available through pip you'll start in the component editor, then head to the Spec. 



Notice there is a key value pair for dependencies. This is where you'll indicate your packages for installation.


 

The spec can list dependencies of each type. For each dependency type there is a key-value pair whose value is an array of objects, each with a path key indicating the desired package or file. As you might guess, apt and pip correspond to the package managers of the same name. File corresponds to Exaptive assets, which can be created in the Studio and are referenced by their associated UUID.


When you've added all of the necessary dependencies, click Save. This will initiate the construction of a Docker container with the given dependencies.


Caution: Installation is performed sequentially, so list your dependencies accordingly. 


Note: By default, the latest version of the package is installed. You can specify version as a sibling property to path.
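
The structure described above can be sketched as a Python dictionary and printed as JSON. The key names (`dependencies`, `apt`, `pip`, `file`, `path`, `version`) follow the prose in this article, but their exact spelling in the spec is an assumption. The package names, the version string format, and the asset UUID placeholder are also illustrative, so check them against the spec in your own component editor.

```python
import json

# Illustrative only: package names, version string, and UUID are placeholders.
spec_fragment = {
    "dependencies": {
        "apt": [{"path": "libxml2-dev"}],
        "pip": [
            # Listed in install order, since installation is sequential.
            {"path": "numpy"},
            {"path": "pandas", "version": "0.20.3"},
        ],
        "file": [{"path": "<asset-uuid>"}],
    }
}

print(json.dumps(spec_fragment, indent=2, sort_keys=True))
```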



Writing code


From inputs to methods


When an input is activated by data being sent into that port, the component will attempt to call a method defined in the script by the same name. That is, if your input port is named `my_data`, then the component will attempt to call a method named my_data. If this bothers you and you want to have a method called my_data_handler instead, you could define the functions as follows.

 

def my_data(state):
  my_data_handler(state)

def my_data_handler(state):
  """
  your function goes here
  """

 


If you use pandas or another data-frame manipulator


Feel free to use pandas for your operations within the component. But when you're sending data out of an output port, you'll need to turn your data frame into the Exaptive data model. More on this later.



The Exaptive Python Data API


The full JS data API is available here. No Python companion is available just yet, but most of the calls can be inferred by translating JS conventions like camelCase into the more Pythonic variable_naming_scheme. However, I'll list a few of the most common API calls here.

1. Getting the input port state. self here corresponds to the main argument to your method. You can change that to state, or component if you'd like, but it must match the argument name of the method.
self.api.inputstate.export()
 
This gives you a dictionary with keys corresponding to the input names.

Example usage:

 

def query(self):
  state = self.api.inputstate.export()
  host = state["host"]
  query = state["query"]
  # More code goes here.
   
 

 

2. Sending output data

 

self.api.output("output_name", my_output_variable)

 

This sends my_output_variable out of the output named "output_name".


Example usage:

 

def addTwoNumbers(self):
  state = self.api.inputstate.export()
  # export() returns a dictionary, so read inputs by key
  first = state["first_number"]
  second = state["second_number"]

  result = first + second
  self.api.output("sum", result)

 


3. Casting your data into the Exaptive data model.

 

self.api.value(my_variable)

 

Example entity:

 

def createPerson(self):
  state = self.api.inputstate.export()

  person = {
    "first_name": state["first"],
    "last_name": state["last"],
    "age": state["age"]
  }

  self.api.output("person", self.api.value(person))

 

Example multiset:

 

def getDuffle(self):
  myduffle = []
  for i in range(10):
    myduffle.append({
      "name": "person " + str(i),
      "age": str(i) + " year(s) old"
    })
  
  self.api.output("people", self.api.value(myduffle))
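
To try a component method outside Exaptive, one option is a small stand-in for `self.api`. Everything below (`FakeInputState`, `FakeApi`, `FakeComponent`) is hypothetical scaffolding that mimics only the three calls shown in this article: `inputstate.export()`, `output()`, and `value()`. The real `value()` casts into the Exaptive data model, whereas this stub returns its argument unchanged, so it checks your logic and not the cast itself.

```python
class FakeInputState(object):
    """Hypothetical stand-in for self.api.inputstate."""

    def __init__(self, values):
        self._values = dict(values)

    def export(self):
        # Return a copy so the method under test cannot mutate the fixture.
        return dict(self._values)


class FakeApi(object):
    """Hypothetical stand-in for self.api; records outputs by port name."""

    def __init__(self, inputs):
        self.inputstate = FakeInputState(inputs)
        self.outputs = {}

    def output(self, name, value):
        self.outputs[name] = value

    def value(self, data):
        # The real call casts to the Exaptive data model; this one does not.
        return data


class FakeComponent(object):
    def __init__(self, inputs):
        self.api = FakeApi(inputs)


def createPerson(self):
    state = self.api.inputstate.export()
    person = {
        "first_name": state["first"],
        "last_name": state["last"],
        "age": state["age"],
    }
    self.api.output("person", self.api.value(person))


component = FakeComponent({"first": "Ada", "last": "Lovelace", "age": 36})
createPerson(component)
```

After the call, `component.api.outputs` holds whatever the method sent to each output port, which makes simple assertions possible.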



"I hit an error trying to cast into the Exaptive data model"


This is common. It means that your variable or data structure contains something that doesn't play nicely with JSON. If you can encode your data into JSON, then it will work with the Exaptive data model. Note that you don't pass JSON text itself to the data model; the point is that anything that can be encoded as JSON can also be cast into the Exaptive data model. Sometimes you have to deal with unicode issues (Python 2.7), sometimes there are special values that you need to call `str` on (e.g. MongoDB ObjectIds), or maybe you need to create a dense array from a sparse array.

Searching for "JSON encode <my_library_and_data_structure>" will probably turn up approaches that help.
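
A quick way to find out whether a value will cast is to try encoding it as JSON first. The sketch below uses only the standard library. `to_json_safe` is a hypothetical helper name, and falling back to `str` for anything the encoder rejects is one possible policy, not an Exaptive requirement. `ObjectIdLike` is a stand-in for a type such as a MongoDB ObjectId.

```python
import json


def to_json_safe(data):
    """Round-trip data through JSON, calling str() on values json can't encode."""
    return json.loads(json.dumps(data, default=str))


class ObjectIdLike(object):
    """Stand-in for a type json cannot encode, such as a MongoDB ObjectId."""

    def __init__(self, hex_value):
        self._hex = hex_value

    def __str__(self):
        return self._hex


record = {"_id": ObjectIdLike("5f1d7f3e"), "tags": ("a", "b"), "count": 3}
clean = to_json_safe(record)
```

Here `clean` contains only dictionaries, lists, strings, and numbers, so it is a reasonable candidate to pass to self.api.value. Be aware that the `str` fallback also applies to types like sets, which become their string form; convert those explicitly if you need the elements.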