Artificial intelligence (AI) is rapidly transforming our world, from automated call centers to the chatbots we interact with on websites. AI is already having a significant impact on our daily lives, whether we're shopping, studying, or working.
Technological advances are fueling the growth of AI. Large-scale research is making computing cheaper and more efficient, and access to more powerful hardware is making it easier and more affordable to train and develop AI systems. As a result, increasingly sophisticated AI systems are being built to solve increasingly complex problems.
At its core, AI is software that imitates human intelligence. AI-based programs can see, read, and understand pictures, text, video, audio, and even emotion; built with the right algorithms and machine learning models, systems can also sense smell, touch, and movement. Machine learning is a method within artificial intelligence for creating intelligent computer systems. Artificial intelligence and machine learning technologies are used in self-driving cars, medical care, and forensics, as well as in other industries such as education, internet security, business, supply chain, and logistics.
This blog provides a comprehensive overview of how we use artificial intelligence in our daily lives, even though we may not be aware of it. It also explores how AI is likely to change our future in dramatic ways.
The outcome of this blog is a single-page web application where we can experience how artificial intelligence works in the browser (e.g., object detection and image classification). To develop this web app, popular deep neural networks trained on large-scale datasets were used, including a Single Shot MultiBox Detector (SSD), along with other important tools such as HTML, CSS, JavaScript, React, TensorFlow.js, and ml5.js.
This section describes all of the technologies used to build the web application.
This part of the blog covers the entire web application development process, from conceptualizing the app's architecture to coding and deployment.
Implementation: The project's goal was to create a single-page web app that could classify objects by category and detect objects within an image. Many of the tools were installed prior to the actual implementation. The text editor used for this project was Visual Studio Code (VS Code), a lightweight yet powerful source code editor that runs on all major operating systems (Windows, Linux, and macOS). VS Code has built-in support for JavaScript and gains extra features for React through extensions, making it ideal for this application. Git Bash, a terminal emulator that provides a Bash shell on Windows, was used as the primary terminal; the development server was also run from Git Bash during development.
Figure 1. Screenshot for importing libraries
Line 1 loads React into the file, line 2 loads the coco-ssd model, which detects the objects defined in the COCO dataset, and line 3 loads TensorFlow.js (Figure 1).
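A minimal sketch of those imports, assuming the standard package names from the TensorFlow.js model repositories:

import React from "react";
// Pre-trained COCO-SSD object detection model
import * as cocoSsd from "@tensorflow-models/coco-ssd";
// TensorFlow.js core, which the model needs as a backend
import "@tensorflow/tfjs";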
Figure 2. Screenshot for video and canvas reference
Next, create references for the video and canvas elements. These make it possible to manipulate the video and the canvas, which together are responsible for showing the webcam feed on the webpage (Figure 2).
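In a React class component, that could look like the following sketch (the names videoRef and canvasRef are illustrative):

class App extends React.Component {
  // Refs give direct access to the underlying <video> and <canvas> DOM nodes
  videoRef = React.createRef();
  canvasRef = React.createRef();

  render() {
    return (
      <div className="object-detection">
        <video ref={this.videoRef} autoPlay playsInline muted width="600" height="500" />
        <canvas ref={this.canvasRef} width="600" height="500" />
      </div>
    );
  }
}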
Figure 3. Screenshot for componentDidMount
componentDidMount is a lifecycle method called after a component has been mounted (inserted into the DOM tree). Any initialization that requires DOM nodes should go here, and it is a good place to start a network request if data needs to be loaded from a remote endpoint. In this case, the lifecycle method is used to access the webcam and start the stream (Figure 3).
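A sketch of what that might look like, modeled on the common TensorFlow.js webcam pattern (the exact code is in Figure 3):

componentDidMount() {
  if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
    // Request webcam access and attach the stream to the <video> element
    const webcamPromise = navigator.mediaDevices
      .getUserMedia({ video: true, audio: false })
      .then(stream => {
        this.videoRef.current.srcObject = stream;
        return new Promise(resolve => {
          this.videoRef.current.onloadedmetadata = () => resolve();
        });
      });
    // Load the pre-trained COCO-SSD model in parallel with the webcam setup
    const modelPromise = cocoSsd.load();
    // Once both are ready, start the detection loop
    Promise.all([modelPromise, webcamPromise]).then(([model]) => {
      this.detectFrame(this.videoRef.current, model);
    });
  }
}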
Figure 4. Screenshot for componentWillUnmount
componentWillUnmount is a lifecycle method invoked immediately before a component is unmounted and destroyed. It is the place to clean up any subscriptions created in componentDidMount(). In this case, we need to stop detectFrame so that it doesn't keep running after the component is gone (Figure 4).
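One plausible cleanup, assuming detectFrame schedules itself with requestAnimationFrame and stores the handle on this.animationFrameId (an illustrative name):

componentWillUnmount() {
  // Stop the recursive detection loop started in componentDidMount
  cancelAnimationFrame(this.animationFrameId);
  // Release the webcam tracks so the camera is switched off
  const stream = this.videoRef.current && this.videoRef.current.srcObject;
  if (stream) {
    stream.getTracks().forEach(track => track.stop());
  }
}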
Figure 5. Screenshot for detectFrame
This function is responsible for making predictions about what the webcam/camera is seeing. It runs each video frame through the COCO-SSD model to produce the predictions (Figure 5).
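A sketch of such a loop using the model's detect() API (this.animationFrameId is the illustrative handle cancelled above):

detectFrame = (video, model) => {
  // Run COCO-SSD on the current frame; it resolves to an array of
  // predictions of the form { bbox: [x, y, width, height], class, score }
  model.detect(video).then(predictions => {
    this.renderPredictions(predictions);
    // Schedule the next frame, keeping the id so it can be cancelled on unmount
    this.animationFrameId = requestAnimationFrame(() => {
      this.detectFrame(video, model);
    });
  });
};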
Figure 6. Screenshot for package.json
Every npm project includes a package.json file, typically located in the project root. This JSON file contains various project-related metadata; npm uses it to identify the project and manage its dependencies (Figure 6).
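For a project like this one, the package.json might look roughly as follows (the name and version numbers are illustrative):

{
  "name": "ai-in-the-browser",
  "version": "1.0.0",
  "dependencies": {
    "@tensorflow-models/coco-ssd": "^2.2.2",
    "@tensorflow/tfjs": "^3.18.0",
    "ml5": "^0.12.2",
    "react": "^17.0.2",
    "react-dom": "^17.0.2",
    "react-scripts": "4.0.3"
  },
  "scripts": {
    "start": "react-scripts start",
    "build": "react-scripts build"
  }
}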
Figure 7. Screenshot for renderPredictions
This function draws a bounding box around each detected object and displays the model's prediction for it (Figure 7).
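A minimal sketch of that drawing logic using the 2D canvas API:

renderPredictions = predictions => {
  const ctx = this.canvasRef.current.getContext("2d");
  // Clear the boxes drawn for the previous frame
  ctx.clearRect(0, 0, ctx.canvas.width, ctx.canvas.height);
  predictions.forEach(prediction => {
    const [x, y, width, height] = prediction.bbox;
    // Draw the bounding box around the detected object
    ctx.strokeStyle = "#00FFFF";
    ctx.lineWidth = 2;
    ctx.strokeRect(x, y, width, height);
    // Label the box with the predicted class and confidence score
    ctx.fillStyle = "#00FFFF";
    ctx.font = "16px sans-serif";
    const label = prediction.class + " " + Math.round(prediction.score * 100) + "%";
    ctx.fillText(label, x, y > 16 ? y - 4 : 16);
  });
};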
Figure 8. Screenshot for CSS
The CSS code that styles the object detection container and the canvas (Figure 8).
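The key idea is to stack the canvas on top of the video so the boxes are drawn over the live feed; a sketch with illustrative class names (the real ones appear in Figure 8):

.object-detection {
  position: relative;
}
.object-detection video,
.object-detection canvas {
  position: absolute; /* overlay the canvas on the video */
  top: 0;
  left: 0;
}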
Figure 9. Screenshot for classifyImg
ml5.imageClassifier() is a method for creating an object that uses a pre-trained model to identify the content of an image. It's worth noting that the example uses a pre-trained model that was trained on a database of around 15 million images; the model is loaded over the network via the ml5 library. What is included, what is omitted, and how those images are labeled or mislabeled all depend entirely on that training data (Figure 9).
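A sketch of how classifyImg might use that method (the element id and state field are illustrative):

classifyImg = () => {
  // The <img> element holding the picture the user uploaded
  const image = document.getElementById("uploaded-image");
  // Load the pre-trained MobileNet model via ml5, then classify the image
  ml5.imageClassifier("MobileNet")
    .then(classifier => classifier.classify(image))
    .then(results => {
      // results is ranked by confidence: results[0].label is the top
      // prediction and results[0].confidence its probability
      this.setState({ prediction: results[0].label });
    });
};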
Figure 10. Screenshot of image recognition
The image classification part of the app can be seen in the screenshot above (Figure 10). The user can upload a picture from their device, and the application predicts what object is in the image. Images are analyzed entirely in the browser and are never stored. This page showcases browser-based deep learning for image classification, using MobileNet, a convolutional neural network, trained on the ImageNet dataset.
Figure 11. Screenshot of object detection
The screenshot in Figure 11 shows browser-based machine learning detecting objects with the COCO-SSD model.
Alin Bhattacharyya is a “Full Stack” Enterprise Architect heading the Frontend practice at Coforge, with over 20 years of experience in Software Engineering, Web and Mobile Application development, Product development, Architecture Design, Media Analysis and Technology Management. His vast experience in designing solutions, client interactions, onsite-offshore model management, and research and development of POC’s and new technologies allow him to have a well-rounded perspective of the industry.