The Computer Weekly Developer Network is in the engine room, covered in grease and looking for Artificial Intelligence (AI) tools for software application developers to use.
This post is part of a series which also runs as a main feature in Computer Weekly.
With so much AI power in development and so many new neural network brains to build for our applications, how should programmers ‘kit out’ their AI toolbox? How much grease and gearing should they get their hands dirty with… and which robot torque wrench should they start with?
Martin says that when it comes to deep learning, which is probably the most prevalent form of AI at the moment, much of the complexity has already been abstracted.
“As a user of deep learning you don’t usually need to go and write training algorithms, since ‘stochastic gradient descent with ****’ is usually what you want. You might want to think about your loss function, but beyond that you don’t really need to think how you will train your network,” said Martin.
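Martin's point is that the training algorithm itself is rarely something you write by hand. As a rough illustration of what the frameworks abstract away, here is a hypothetical, minimal sketch of stochastic gradient descent fitting a straight line under a mean-squared-error loss (the data, learning rate and variable names are invented for illustration; real libraries hide this loop behind their optimiser objects):

```python
import numpy as np

# Toy data: y = 3.0 * x + 0.5, no noise.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = 3.0 * x + 0.5

w, b, lr = 0.0, 0.0, 0.1
for epoch in range(200):
    for i in rng.permutation(100):     # "stochastic": one sample at a time
        pred = w * x[i] + b
        err = pred - y[i]              # d(0.5 * err**2) / d(pred)
        w -= lr * err * x[i]           # gradient step for the weight
        b -= lr * err                  # gradient step for the bias

print(round(w, 2), round(b, 2))        # → 3.0 0.5
```

The only choices the user really made here are the loss function (mean squared error) and the learning rate; the update rule itself is boilerplate, which is exactly why frameworks ship it ready-made.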
He explains that, in addition, it is important to remember that many different layer types are already implemented in openly available libraries, so developers should really view any single layer as a whole set of computing neurons connected together.
“So, as a user of deep learning you really are piecing together layers, thinking about the data you will train them on… and then doing the training. I’m not going to underestimate the difficulty of coming up with appropriate network architectures for the layers – but since that is problem-dependent, it seems reasonable that someone who wants to focus on building smarter applications needs to think about their deep network architecture,” said Martin.
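The “piecing together layers” idea can be sketched without any framework at all: each layer is just a function from activations to activations, and a network is their composition. This is a hypothetical NumPy illustration (the `dense`/`relu`/`network` helpers are invented for this sketch; library layer types such as Dense, ReLU or Conv follow the same pattern with trainable parameters attached):

```python
import numpy as np

def dense(weights, bias):
    """A fully connected layer as a plain function."""
    return lambda x: x @ weights + bias

def relu():
    """A non-linearity, also just a function."""
    return lambda x: np.maximum(x, 0.0)

def network(layers):
    """A network is the composition of its layers, applied in order."""
    def forward(x):
        for layer in layers:
            x = layer(x)
        return x
    return forward

rng = np.random.default_rng(1)
model = network([
    dense(rng.normal(size=(4, 8)), np.zeros(8)),   # 4 inputs -> 8 units
    relu(),                                        # non-linearity
    dense(rng.normal(size=(8, 2)), np.zeros(2)),   # 8 units -> 2 outputs
])

out = model(np.ones((3, 4)))   # batch of 3 examples, 4 features each
print(out.shape)               # → (3, 2)
```

Choosing which layers to stack, in what shapes and in what order, is the architecture question Martin says remains problem-dependent.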
Martin points out that much of the time and effort in developing new deep learning models involves data cleansing, data labelling and data marshalling.
He insists that the training data part of the total equation hugely affects the overall performance of any model and must be considered a first-class citizen alongside the network architecture and training algorithms.
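To make the data-cleansing and data-labelling point concrete, here is a hypothetical, minimal sketch in plain Python: malformed records are dropped, text is normalised and string labels are mapped to integer ids before any training begins. The field names and label scheme are invented for illustration:

```python
# Raw records as they might arrive: some empty, some unlabelled.
raw = [
    {"text": "Great product", "label": "positive"},
    {"text": "", "label": "positive"},          # empty text: drop
    {"text": "Terrible", "label": "negative"},
    {"text": "ok I guess", "label": None},      # no label: drop
]

label_ids = {"negative": 0, "positive": 1}      # labelling scheme

# Cleansing + labelling: keep only usable rows, normalise, encode.
cleaned = [
    (row["text"].strip().lower(), label_ids[row["label"]])
    for row in raw
    if row["text"].strip() and row["label"] in label_ids
]
print(cleaned)  # → [('great product', 1), ('terrible', 0)]
```

Unglamorous as it looks, this is where Martin says much of the real time and effort goes, and getting it wrong degrades the model regardless of how good the architecture is.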
So, talking of responsibilities, is there a responsibility to use open source frameworks to share the machine knowledge created, so that it’s a case of deep learning for all?
Martin says that there may be a responsibility to make sure your model can be shared across different frameworks, but not a responsibility to use any particular framework whilst developing your model.
“There are software and deep learning tools which allow both import and export of our deep networks via the Open Neural Network Exchange format (ONNX – https://onnx.ai/). This allows different frameworks to use models developed in others. Developers also now have the ability to import a model developed in TensorFlow and reuse the network architecture and some parts of the model when undertaking transfer learning on a new dataset,” he said.
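The transfer learning idea Martin describes – reuse an imported network’s learned layers unchanged and retrain only a new output layer on fresh data – can be sketched framework-agnostically. In this hypothetical NumPy illustration, a fixed random projection stands in for the imported, pretrained feature extractor (in practice that part would come from a model imported via ONNX or TensorFlow), and only the new head is trained:

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for imported, pretrained weights: frozen, never updated.
W_frozen = rng.normal(size=(5, 16))

def features(x):
    """Frozen feature extractor: reused base + ReLU."""
    return np.maximum(x @ W_frozen, 0.0)

# New task's dataset (toy binary target).
x_new = rng.normal(size=(64, 5))
y_new = (x_new[:, 0] > 0).astype(float)

w_head = np.zeros(16)                # only this new head is trained
lr = 0.05
for _ in range(300):
    f = features(x_new)
    pred = 1.0 / (1.0 + np.exp(-(f @ w_head)))   # sigmoid head
    grad = f.T @ (pred - y_new) / len(y_new)     # logistic-loss gradient
    w_head -= lr * grad                          # update the head only

acc = np.mean((pred > 0.5) == y_new)
```

Because the expensive part – the pretrained base – is reused rather than retrained, only a small number of parameters need fitting on the new dataset, which is the practical payoff of being able to move models between frameworks.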
Martin concludes that the interesting thing being developed is the model and its architecture. He asserts that ONNX is an excellent way for the different frameworks to interconnect, allowing the right tools to be used in the right situation.