Deep Learning Augmentation Data Deep Dive

Duc Haba
23 min read · Oct 31, 2021


Update Jan 13, 2021: “Reducing Data Biases By Augmentation,” an ongoing real-world project (section #U.1)

1 — Introduction

Welcome to the “Deep Learning Augmentation Data Deep Dive” (AUD3) project. It is a new journey in the “Demystify AI” series.

Figure 1.

What these journeys have in common is that the problems are taken from real-world Artificial Neural Network (ANN) projects. I have worked on them, coded them, cried over them, and sometimes had a flash of insight or an original thought about them.

The journeys are a fun and fascinating insight into the daily work of an AI Scientist and Big-Data Scientist. They are for colleagues and AI students, but I hope that you, gentle reader, will enjoy them too.

The logic behind data augmentation is uncomplicated. More pictures generally increase the ANN model accuracy, and data augmentation gives you more images.

The AUD3 is a hackable, step-by-step Jupyter Notebook. It is for learning about data augmentation and selecting the correct parameters in the ANN image classification and image-segmentation projects. You can skip ahead to the result-cells and not read the math and the code-cells. The math is elementary, and coding is straightforward logic, so try it out. You might enjoy “hacking” along the journey.

Data augmentation can increase the number of training images by a factor of 2 to 16 or more. For an ANN, that means the model can train for more epochs and reach better accuracy with less risk of overfitting.

For example, I was working on an AI project for a large casino hotel. The goal was to classify every customer into one of 16 categories as they walked through the door. In other words, the goal was not to identify that guy walking through the door as “Duc Haba” but to classify him in the “whale (A-1)” category, i.e., a big spender.

As you have guessed, the first and most significant problem is the lack of labeled pictures. I need millions of tagged photos because of human diversity in race, ethnicity, and clothing, different camera angles, and so on.

An ANN is not a rule-based expert system. A rule-based system would say, for example, that a person wearing a Rolex watch is an “A-1,” or that a guy with no shoes and no shirt is a “D-4” category. An ANN does not use rules, so it needs millions of labeled images to train and generalize so that the model can correctly classify a guy who enters the casino for the first time. In ANN lingo, it means the model is not overfitting.

I classify the AUD3 as a “sandbox” project. In other words, it is a fun, experimental project focusing on solving one problem.

So if you are ready, let’s take a collective calming breath … … and begin.

2 — The Journey

As with the previous journey, the first step is to choose and name your dog companion. With the project code name AUD3, the natural name has to be “Wallaby.”

Figure 2. Your guide, Wallaby

Typically a canine name is chosen, e.g., “Lefty,” “Roosty,” or “Eggna,” but be warned, don’t name it after your feline because a “cat” will not follow any commands.

If you are serious about learning data augmentation, start hacking by changing the companion name to your preference. Jupyter notebook allows you to add notes, add new code, and hack Wallaby’s code.

If you are a friend tagging along, you will like Wallaby. He is a friendly, helpful dog. He will do all the tedious work in good spirits, and he relishes hopping around. As a good little programmer, Wallaby (or insert your companion name here) starts by creating an object or class.

Wallaby will skip copying the code-cells here. Please visit the “AUD3_augmentation_data_deep_dive” Jupyter notebook on GitHub to view the code. However, he will insert the output as an image.

2 — Section 2 & 3 — Wallaby Class

The following Jupyter Notebook is a clean version. Wallaby has cleaned up the trial-and-error cells, but please don’t let that stop you from inserting your own code-cells and notes as we make this journey together.

Figure 3.

When copying the code into Atom’s project, Wallaby would add the methods during the class definition, but in a notebook, he will hack it and add new functions as needed.

Monty is like Wallaby. He is a Python class refactored in the Atom project and stored in GitHub. Monty is an alpha-dog that follows the same methodology. He hacked it in a Jupyter Notebook and then copied it into a Python Atom project.

Monty is not a public GitHub project at this stage. However, Monty’s code exists in many of Duc Haba’s sandbox projects on GitHub. For this journey, Monty’s abilities to draw 2D and 3D graphs and to clean images will be handy. They come from previous projects, the “Demystify Python 2D Charts” and the “3D Visualization” sandbox projects.

2 — Section 4 — Fetch Images Data

Figure 4.

Wallaby has a companion named Monty. He will do all the dirty work that does not directly pertain to this journey. Spending time teaching Wallaby those tasks would distract from the AUD3 journey.

Wallaby encourages you to hack the notebook and use your image data set. His first task is as follows.

  1. Fetch the landscape image set.
  2. Fetch the cityscape image set.
  3. Fetch the people’s face image set.
  4. Fetch the satellite image set.

Wallaby randomly pulls the images from “Google” or “Bing” image-searches. He uses the Chrome web store “Download All Images” extension to download and pack them into a zip file. Wallaby claims no rights on these pictures. He uses them only for research purposes.

2 — Section 5 — Inspect Images

Figure 5.
Figure 6.
Figure 7.

Wallaby relies on Monty to do the photo inspection and cleaning. It is a prelude to data augmentation, but it is not essential to teach Wallaby. It would distract from the AUD3 journey.

Send Wallaby’s human companion an email or post a comment if you want Wallaby to do a Jupyter notebook sandbox project about image inspection and cleaning. Wallaby repeatedly runs the code-cells below to see a random image set. (Figure 5.)

Wallaby could limit the thumbnail view to only one image, e.g., “faces.” (Figure 6.)

Wallaby wants to inspect one file at a time. (Figure 7.)

2 — Section 6 — Clean The Images

Data cleaning does not affect the data augmentation outcome, but Wallaby is an organized nutcase. He likes things square and neatly lined up. And since Monty knows how to do the cleaning, it will not distract from the journey’s goals.

Once again, if you would like to see an in-depth Jupyter Notebook about data cleaning (or cleansing), labeling, aggregating, identifying, and wrangling, contact Wallaby’s human companion.

Figure 8.
Figure 9.
Figure 10.
Figure 11.

Wallaby starts by picking the image size. It is not as easy as it sounds because the photo’s size is tightly coupled with the goal of the ANN classification model and the subject matter. If the goal is as simple as classifying handwritten digits, then 64x64 pixels is sufficient.

What is the optimal photo size for ANN classification of the ethnicity of older adults’ faces? Clearly, the 2024x3040 pixel size is too big, and the 64x64 pixel size is too small. Should it be 400x400, 800x800, or 1024x1024 pixels? Or does it even matter if Wallaby chooses any of the mid-range sizes? Does higher resolution always yield a better accuracy rate? There is no clear answer to any of the above questions.

There are practical considerations to take into account. The photo size of 2024x3040 pixels yields 6,152,960 input nodes per image for a single color channel. When Wallaby multiplies the number of input nodes by the batch size and accounts for the activations stored at every layer of the ANN architecture, the result is a pretty large number. (The epoch count adds training time, not memory.) It can easily exceed the RAM of the GPU cards commonly used for ANN projects, where eight, twelve, and sixteen gigabytes are the norm.
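To put numbers on those node counts, here is a minimal sketch. It is not taken from Wallaby’s notebook; the function names, the three color channels, the batch size of 32, and the 4-byte float values are illustrative assumptions. It only counts the raw input tensor, not the activations inside the network, which add much more.

```python
# Rough size of the raw input for one image and for one batch.
# Assumptions (not from the notebook): RGB images, batch of 32, float32 values.

def input_nodes(width, height, channels=1):
    """Number of input values for one image."""
    return width * height * channels


def batch_megabytes(width, height, channels=3, batch_size=32, bytes_per_value=4):
    """Memory for the raw input tensor of one batch, in megabytes (10**6 bytes)."""
    total_bytes = input_nodes(width, height, channels) * batch_size * bytes_per_value
    return total_bytes / 1_000_000


for w, h in [(2024, 3040), (448, 448), (64, 64)]:
    print(f"{w}x{h}: {input_nodes(w, h):,} values per channel, "
          f"{batch_megabytes(w, h):,.1f} MB per batch")
```

The first line of output reproduces the 6,152,960 figure; the gap between the largest and the mid-range size shows why the choice of image size matters before any augmentation starts.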

Wallaby hates to choose a size willy-nilly because he is trained as a canine computer scientist and not a psychologist. Maybe he should wait for his friend Magpiena to complete her master’s thesis. :-)

For now, Wallaby chooses the 448x448 pixel size because the ResNet architecture uses the formula (2^n * 7), where “n” is an integer starting with one, e.g., (2⁶ * 7) = 448. Wallaby wants all photos to be square and centered. (Figure 8.)
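As a small illustration of that size formula and of “square and centered,” here is a sketch. The helper names are hypothetical and this is not Monty’s cleaning code; it only lists the sizes of the form 2^n * 7 and computes the crop box for the largest centered square, which could then be resized to 448x448.

```python
def resnet_style_sizes(max_n=7):
    """Sizes of the form 2**n * 7 for n = 1..max_n."""
    return [2 ** n * 7 for n in range(1, max_n + 1)]


def center_square_box(width, height):
    """(left, top, right, bottom) of the largest centered square in the image."""
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


print(resnet_style_sizes())           # [14, 28, 56, 112, 224, 448, 896]
print(center_square_box(2024, 3040))  # (0, 508, 2024, 2532)
```

The list includes 224, the common ResNet input size, one step below Wallaby’s 448.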

Wallaby is right. Monty cleans well. While Monty is at it, the file names are too long, and there is no embedded information in them, so Wallaby wants to make them all nice and tidy. :-) (Figure 9.)

Wallaby double-checks Monty’s handiwork. (Figure 10 & 11.)

That is nice. All images are square, the same size, centered, and neatly named, e.g., “city1.jpg, city2.jpg, face1.jpg, face2.jpg, land1.jpg, land2.jpg”, and so on. Thank you, Monty.

Wallaby is ready to dive into the heart of the journey. He will turn the 300 images into thousands of photos without compromising the subject. The big bonus is the reduction in storage size. The four original zip files total 412 MB unzipped, and after cleaning, the total size is 57 MB. It is a whopping 86% reduction in file storage.
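The 86% figure follows directly from the two sizes quoted above; the one-line helper below (a hypothetical name, not part of the notebook) shows the arithmetic.

```python
def reduction_percent(before_mb, after_mb):
    """Percentage saved when storage shrinks from before_mb to after_mb."""
    return (before_mb - after_mb) / before_mb * 100


print(f"{reduction_percent(412, 57):.1f}% smaller")  # 86.2% smaller
```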

2 — Section 7 — Flip, Augment Image Data

Wallaby starts by flipping the photo left-to-right and up-and-down. In other words, Wallaby will flip the image horizontally along the “Y-axis” and then flip the picture vertically along the “X-axis.”

The goal is to see which image set, the “faces, cityscape, landscape or satellite,” can be flipped horizontally, vertically, or both without compromising the subject. For example, a building can’t be flipped upside-down.

Figure 12.
Figure 13.
Figure 14.
Figure 15.

Wallaby may not need to write the code to illustrate flipping. However, the code is easy to write, and Wallaby can’t tell the difference between a building and a fire hydrant. After flipping come tilt, skew, warp, and light-contrast, where the right amount is harder to judge.

Wallaby uses the “” library for flipping, and its vertical flip behaves more like a 90-degree rotation.

After re-running the “flipping faces” code-cells a few times, it’s clear that people’s faces can be flipped right-to-left without harm, and flipping vertically or rotating 90 degrees may be acceptable too, because kids can hang upside down from a tree. (Figure 12.)
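To make the difference between the flips and a 90-degree rotation concrete, here is a dependency-free sketch on a tiny grid of numbers standing in for pixels. It is not the notebook’s library code; the function names are illustrative, and a real project would apply the same idea to image arrays.

```python
def flip_horizontal(image):
    """Mirror left-to-right (reflect across the vertical Y-axis)."""
    return [list(reversed(row)) for row in image]


def flip_vertical(image):
    """Mirror top-to-bottom (reflect across the horizontal X-axis)."""
    return [list(row) for row in reversed(image)]


def rotate_90_clockwise(image):
    """Quarter turn clockwise, shown to contrast with a vertical flip."""
    return [list(row) for row in zip(*reversed(image))]


tiny = [[1, 2, 3],
        [4, 5, 6]]
print(flip_horizontal(tiny))      # [[3, 2, 1], [6, 5, 4]]
print(flip_vertical(tiny))        # [[4, 5, 6], [1, 2, 3]]
print(rotate_90_clockwise(tiny))  # [[4, 1], [5, 2], [6, 3]]
```

Note that the rotation changes the grid’s shape from 2x3 to 3x2, while both flips keep it, which is one quick way to tell which operation a library actually performed.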

It depends on the ANN use-cases. For example, if the use-case allows kids to identify people’s ethnicity using their iPhone while walking in the park, kids will sometimes turn their phone sideways or upside down. Therefore, rotating the photo is a viable option.

For “landscape,” flipping horizontally is good, but flipping vertically or rotating 90 degrees doesn’t look right. If Wallaby were a World War II fighter pilot, then seeing the landscape upside down or sideways would be normal. (Figure 13.)

After repeatedly running the above “cityscape” code-cells, Wallaby found the same result as with the “landscape” images. Flipping the image horizontally is OK, but not vertically. (Figure 14.)

Wallaby runs the above “satellite” code-cells over and over again, but he can’t tell right from wrong in the satellite photos. Therefore, both flipping right-and-left and up-and-down are valid options. His friend, Magpiena, is better at it because she can fly. (Figure 15.)

2 — Section 8 — Warp, Augment Image Data

Two variables control the “warp” feature: the degree of warping and how often it is applied. The “degree” is a float number ranging from 0 to 1. Wallaby categorizes “low warping” as between 0.0 and 0.15, “mid warping” as between 0.0 and 0.40, and “high warping” as between 0.0 and 0.95.

If you are hacking the notebook, and Wallaby 100% encourages it, change the value repeatedly until you find the one best suited to a particular image data set.

The “how often” variable is the probability that the system uses the “warp” feature. It is a float number ranging from 0 to 1, where zero never uses it, and one uses it all the time. For example, a probability of 0.75 means the system uses the “warp” feature three times out of four.
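The two warp variables can be sketched as a pair of random draws: one decides whether to warp at all, the other picks how much. This is a simplified illustration and not the notebook’s library code; the function name and the uniform draw are assumptions, while the level bounds are the ones quoted above.

```python
import random

# Upper bounds for each warp level, as quoted in the text.
WARP_LEVELS = {"low": 0.15, "mid": 0.40, "high": 0.95}


def sample_warp(level, p, rng):
    """Return a warp magnitude, or None when the warp is skipped this time."""
    if rng.random() >= p:          # p = 0 never warps, p = 1 always warps
        return None
    return rng.uniform(0.0, WARP_LEVELS[level])


rng = random.Random(0)             # fixed seed so reruns match
draws = [sample_warp("low", 0.75, rng) for _ in range(10_000)]
applied = [d for d in draws if d is not None]
print(f"warp applied {len(applied) / len(draws):.1%} of the time")
print(f"largest magnitude drawn: {max(applied):.3f}")
```

Over many draws the applied share settles near 75%, and no magnitude exceeds the “low” bound of 0.15.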

Figure 16.
Figure 17.
Figure 18.
Figure 19.

Face warping is subjective. People get fatter or skinnier over time, and Wallaby found that “low warping” seems to have that same effect. The person’s facial appearance does not change, and so Wallaby still recognizes the same face. (Figure 16.)

Once again, it depends on the ANN use-case. For example, if the use-case is to identify a person, then “low warping” is very useful. If that person gets fatter or skinnier, the system has no trouble recognizing him/her. It is the same for different camera angles that make a person’s face squashed or stretched. “Mid or High” warping is too extreme for people’s faces.

Cityscape and warping features are not the right combination. Wallaby doesn’t want to see buildings leaning and swaying unless he is living in Pisa, Italy. (Figure 17.)

Wallaby couldn’t think of a good use-case where augmenting cityscape images with warping is warranted. Eating magic mushrooms causes Wallaby to see swaying buildings, but that is a different journey. :-)

Nature doesn’t come in a straight line, so warping landscape photos is a viable option to increase the number of images for the ANN training session. (Figure 18.)

The question is how far Wallaby can warp the photos before it affects the image integrity. After running the “landscape” code-cells repeatedly, Wallaby found that “Low” warping is sufficient, and “Mid” warping is sometimes OK. Therefore, Wallaby thinks the best value for distortion is 0.35. What is the optimal value that you find?

Satellite or aerial photos are a mixture of landscape and cityscape, where people like to build in straight lines, and nature goes any way it likes. (Figure 19.)

It is a clear case where the ANN project’s use-case or goal dictates whether to use the warp feature. For example, suppose the goal is to identify roads, buildings, streets, rivers, forests, and mountains. In that case, warping is a viable option, but if the goal is to identify the city from the aerial photos, then warping is not a good option. It is because the bends and straight runs of the streets are the characteristics of a particular city.

2 — Section 9 — Tilt, Augment Image Data

Figure 20.
Figure 21.
Figure 22.
Figure 23.

Another fun augmentation feature is tilting the photo left or right. The tilting degree range is typically between -45 degrees and 45 degrees, but Wallaby can spin it around from zero to 360 degrees.

The “how often” variable is the probability that the system uses the “tilt” feature. It is a float number ranging from 0 to 1, where zero never uses it, and one uses it all the time. For example, a probability of 0.75 means the system uses the “tilt” feature three times out of four.
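A tilt can be sketched the same way: draw an angle inside the allowed range with the given probability, then rotate around the image center. The code below is an illustration under assumptions, not the notebook’s code; it rotates a single point with the standard rotation matrix rather than resampling a whole image, and the function names are hypothetical.

```python
import math
import random


def sample_tilt(max_degrees, p, rng):
    """Angle in [-max_degrees, +max_degrees], or 0.0 when the tilt is skipped."""
    if rng.random() >= p:
        return 0.0
    return rng.uniform(-max_degrees, max_degrees)


def rotate_point(x, y, cx, cy, degrees):
    """Rotate (x, y) around (cx, cy) using the standard rotation matrix.

    With the y-axis pointing up this turns counterclockwise; on screen,
    where image rows grow downward, the same math looks clockwise.
    """
    theta = math.radians(degrees)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(theta) - dy * math.sin(theta),
            cy + dx * math.sin(theta) + dy * math.cos(theta))


rng = random.Random(1)
angles = [sample_tilt(45.0, 0.75, rng) for _ in range(5)]
print([round(a, 1) for a in angles])

# The middle of the right edge of a 448x448 image, turned a quarter turn
# around the center (224, 224), lands on the middle of another edge.
x, y = rotate_point(448, 224, 224, 224, 90)
print(round(x), round(y))  # 224 448
```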

For face photos, “Low and Mid” tilting is natural because people tilt their heads. (Figure 20.)

For cityscape photos, tilting depends on the use case. For example, if the use case involves people biking around a city taking photos, then “Low to Mid” tilting is a good option. If the use case is taking pictures from a fixed camera mounted on a car, tilting may not be viable. (Figure 21.)

For landscape photos, nature is curvy, and so “Low to Mid” tilting looks natural. (Figure 22.)

For satellite photos, Wallaby’s friend Magpiena sometimes flies the old “CAC Boomerang” aircraft, and her photos are anything but straight. Therefore, tilting is a good image augmentation option for satellite photos. (Figure 23.)

2 — Section 10 — Zoom, Augment Image Data

Figure 24.
Figure 25.
Figure 26.
Figure 27.

For image augmentation, the zoom feature is more effective when the pictures are square and centered. The typical range is a float number between 1.0 and 2.0, where 2.0 means double the image size. It is hard to imagine what use-cases would require Wallaby to increase the image size by a factor of three, four, or five.
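One common way to implement a zoom, assumed here for illustration and not confirmed as the notebook’s method, is to crop a smaller centered window and resize it back to the full size. The hypothetical helper below only computes that crop window, which is enough to see how much of the picture is lost at each zoom level.

```python
def zoom_crop_box(side, zoom):
    """Centered (left, top, right, bottom) window for a square image.

    Resizing this window back to side x side magnifies the content by `zoom`.
    """
    if zoom < 1.0:
        raise ValueError("zoom must be 1.0 or greater")
    window = round(side / zoom)
    offset = (side - window) // 2
    return offset, offset, offset + window, offset + window


for zoom in (1.0, 1.5, 2.0):
    left, top, right, bottom = zoom_crop_box(448, zoom)
    kept = (right - left) * (bottom - top) / (448 * 448)
    print(f"zoom {zoom}: box {(left, top, right, bottom)}, keeps {kept:.0%} of the area")
```

At zoom 2.0 the window is (112, 112, 336, 336), so only a quarter of the original area survives and anything near the border falls out of frame.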

Since the face photos are portraits, square, and centered, “Low or Mid” zoom is acceptable. (Figure 24.)

In cityscape photos, “Low” zoom is reasonable. The degree of zooming is dictated by the use-case. (Figure 25.)

In landscape photos, whether to zoom and by how much depend on the use-case. The images are wide shots; therefore, zooming up to the “Mid” level is acceptable. (Figure 26.)

In satellite photos, a typical use-case is to classify multi-labels such as buildings, rivers, bridges, churches, roads, forests, meadows, cornfields, and so on. If Wallaby uses the zoom feature, he might lose a building on the edge of the photo and would not label the image correctly. (Figure 27.)

2 — Section 11 — Light, Augment Image Data

Figure 28.
Figure 29.
Figure 30.
Figure 31.

There are many more image augmentation methods, but Wallaby will try the “light-effect” as the last one. The “light” factor is a float number between 0 and 1.0, where a higher number implies more effects.

The “how often” variable is the probability that the system uses the “light-effect” feature. It is a float number ranging from 0 to 1, where zero never uses it, and one uses it all the time. For example, a probability of 0.75 means the system uses the “light-effect” feature three times out of four.
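As a rough picture of what a light factor does, the sketch below brightens or darkens 8-bit grayscale pixels by a random amount within the factor’s range. The linear blend toward white or black is an assumption chosen for simplicity, the function names are hypothetical, and the notebook’s library may use a different formula.

```python
import random


def adjust_light(image, change):
    """Brighten (change > 0) or darken (change < 0) 8-bit grayscale pixels.

    change is in [-1, 1]: +1 pushes every pixel to 255, -1 pushes it to 0.
    """
    def shift(value):
        if change >= 0:
            return round(value + (255 - value) * change)
        return round(value * (1 + change))
    return [[shift(v) for v in row] for row in image]


def random_light(image, max_light, p, rng):
    """Apply a random light change within +/- max_light with probability p."""
    if rng.random() >= p:
        return image
    return adjust_light(image, rng.uniform(-max_light, max_light))


row = [[0, 100, 255]]
print(adjust_light(row, 1.0))    # [[255, 255, 255]]
print(adjust_light(row, -0.5))   # [[0, 50, 128]]
print(random_light(row, 0.2, 0.75, random.Random(3)))
```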

For people’s face photos, the light-effect changes their skin color from a lighter tone to a darker tone or vice-versa. Wallaby found that the “Low” light-effect might be acceptable because people are tanner in the summer and paler in the winter. (Figure 28.)

Once again, the use-cases prescribe whether to use or not to use the light-effect filter. For example, if the use-case is for classifying a person’s identity, then light-effect is meaningful. If the use-case is for classifying a person’s ethnicity, then light-effect is not a good option.

For cityscape photos, the light source is the sunlight; therefore, light-effect is a valid option to extend the image set through augmentation. Wallaby recommends the “Low and Mid” light-effect option. (Figure 29.)

The landscape photos are the same as the cityscape photos when it comes to light-effect. They are both lit by sunlight, so the “Low and Mid” light-effect level is sufficient. (Figure 30.)

For satellite photos, as said before, if the use-case is to classify multi-label landmarks, then too much light will give the effect of a hazy day. “Hazy” is one of the labels, so using the light-effect option might produce a false-positive. (Figure 31.)

2 — Section 12 — Put It All Together

From the above five augmentation features, Wallaby will define the optimal combined values for the transformation on each image data-set.

Figure 32.
Figure 33.
Figure 34.
Figure 35.
Figure 36.

The coding is super simple. It is the experience from running the code-cells repeatedly that counts. Is a tilting factor of 0.3 better than 0.25 or 0.35 for faces? There is no mathematically calculated optimal value. It is a judgment call, and it varies from person to person. The salient point is that coding it gives data scientists a common platform to discuss the differences logically.
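One way to keep such judgment calls discussable is to write them down as data. The table below is one reading of the section-by-section findings in this article, not the notebook’s actual parameter values; the dictionary layout and the helper name are illustrative assumptions. True means the feature looked safe, False means avoid it, None means it depends on the use-case, and level names follow the text.

```python
# One reading of Wallaby's findings per image set (not the notebook's values).
FINDINGS = {
    "faces":     {"flip_horizontal": True, "flip_vertical": None,
                  "warp": "low", "tilt": "low-mid",
                  "zoom": "low-mid", "light": "low"},
    "cityscape": {"flip_horizontal": True, "flip_vertical": False,
                  "warp": False, "tilt": None,
                  "zoom": "low", "light": "low-mid"},
    "landscape": {"flip_horizontal": True, "flip_vertical": False,
                  "warp": 0.35, "tilt": "low-mid",
                  "zoom": "low-mid", "light": "low-mid"},
    "satellite": {"flip_horizontal": True, "flip_vertical": True,
                  "warp": None, "tilt": True,
                  "zoom": False, "light": None},
}


def usable_features(image_set):
    """Features that are neither ruled out (False) nor use-case dependent (None)."""
    return [name for name, value in FINDINGS[image_set].items()
            if value is not False and value is not None]


for name in FINDINGS:
    print(f"{name:10s} {usable_features(name)}")
```

If you disagree with an entry, change it and rerun; that is the kind of concrete disagreement the paragraph above is asking for.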

A Jupyter notebook is a superior tool to other media, such as blogs, whitepapers, video streams, or podcasts. It is because a Jupyter notebook enables interactivity both in real time and in self-paced mode.

For example, Wallaby stated that 0.245 is the optimal value for light-effect in the cityscape image set. You can repeatedly run the code-cells to verify Wallaby’s claim. You should disagree with Wallaby because he is a dog, and canines can only see blue-and-green color shades. :-)

White papers, blogs, or video streams cannot facilitate the above interaction, but a Jupyter notebook is made for it.

Wallaby will draw the data-bunch for each image data set. There is an option to turn the transformation feature on or off.

That is super cool. Every time Wallaby runs the above code-cells, the system displays a different set of images. The system shuffles the pictures as a data-bunch would request, but the central point is that the system randomly transforms each photo based on the overall set parameters. Therefore, Wallaby can train the ANN model with more epochs without overfitting. (Figure 32 & 33.)
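The idea that every run shows a freshly transformed copy can be sketched as a stream of random parameter draws. This is a simplified stand-in for what a data-bunch does, not its real code; the parameter ranges, the seed, and the function names are illustrative assumptions.

```python
import random


def random_params(rng):
    """Draw one random set of augmentation parameters (ranges are illustrative)."""
    return {
        "flip": rng.random() < 0.5,
        "tilt": round(rng.uniform(-10.0, 10.0), 1),
        "zoom": round(rng.uniform(1.0, 1.1), 2),
    }


def epoch_stream(file_names, epochs, seed=0):
    """Yield (epoch, file_name, params) with fresh params on every visit."""
    rng = random.Random(seed)
    for epoch in range(epochs):
        order = file_names[:]
        rng.shuffle(order)                 # reshuffle the files each epoch
        for name in order:
            yield epoch, name, random_params(rng)


files = ["city1.jpg", "city2.jpg", "face1.jpg"]
views = list(epoch_stream(files, epochs=12))
print(len(views), "training views from", len(files), "files")  # 36 training views from 3 files
for epoch, name, params in views[:3]:
    print(epoch, name, params)
```

Twelve epochs over three files give 36 views, each with its own parameters, which is where a size increase of “a factor of 12 or more” comes from.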

What are your optimal values for the face image set? Are they different from Wallaby’s parameters?

Wallaby can run the above code-cells with “is_transform=False” so that he can view the data-bunch images as-is. It is a good practice for checking that Wallaby has not compromised the subject in the data set.

For cityscape data-bunch, Wallaby ran the above code-cells a dozen times, and he can’t see any differences, but that is the desired outcome. The data set is augmented to increase the size by a factor of 12 or more, and to a canine (and human) view, the system does not change the integrity of the data set. (Figure 34.)

For the landscape data-bunch, Wallaby may go overboard with the image transformation. But it is nature, and weirder things can happen. Who knows how a canine perceives a tree or a lake? (Figure 35.)

Do you agree with Wallaby?

For the satellite data-bunch, Wallaby chooses safe parameters. It still looks like a drunken crop-duster taking aerial photographs. Magpiena assured Wallaby that is how she sees the world. No wonder that dogs and birds do not get along. (Figure 36.)

Wallaby did turn the transform feature on and off for the data-bunch, but that doesn’t help him. Birds are crazy, but not as crazy as cats. :-)

3 — Wrap-Up

The “AUD3” journey begins with Wallaby inviting Monty because he is an expert in cleaning data. Wallaby directs Monty to download the four image data sets: the older adult faces, the cityscape, the landscape, and the aerial satellite photos.

Even though data augmentation does not require the pictures to be square, centered, and uniform in size, Wallaby insists on having all images resized to 448x448 pixels, centered, and given normalized file names. He likes being organized and tidy.

The goal of data augmentation is to increase the number of images for training an ANN model. As a rule, the more images available for training, the higher the accuracy rate and the lower the chance of overfitting.

Wallaby has fun with the “flip, warp, tilt, zoom, and light-effect” augmentation features. Wallaby writes the code to illustrate each effect fully. He plays with it over and over again. He increases the parameters a little bit, then decreases them a little bit. He runs it repeatedly until he has a firm grasp of the effect on the photos.

The code is easy to write, so the effort is mostly in viewing the photos and discerning how much flip, warp, tilt, zoom, and light-effect would be sufficient. The choice of which feature to use and how much is subjective. The ANN model goals and use-cases dictate which augmentation features are appropriate, but how much is up to the data scientist’s judgment.

Wrapping up the journey, Wallaby draws the data-bunch for each image data set with the optimal transformation parameters and without it. He encourages everyone to hack the Jupyter notebook, add their images, and choose the augmentation parameters.

Figure 37.

Wallaby notices there is a lot of common sense used throughout the journey. Aside from the coding, math, and terminology, there is no magic, and like most professions, common sense backed by data is the best policy.

Fun, interactivity, and a partner are the requirements for a good hike, as well as for an effective learning session. That is why Wallaby is a trained hiking dog. He is not a teacup dog or a pampered family dog. He likes hiking, listening to (digital) commands, respecting nature, and knowing not to harm wildlife unnecessarily.

Wallaby is looking forward to the next journey and hopes to see you again.

The End.

Update 1 — “Reducing Data Biases By Augmentation,” an ongoing real-world project

Wallaby’s hue-mon {human} friend’s company has approved the “reducing data biases by augmentation” (RDBA) project. Due to privacy and intellectual property concerns, Wallaby’s friend can’t name the company or the actual project name.

The project scope is similar to randomly adding graffiti to road sign images. The bias in the image data is that the photos show only clean street signs, and therefore when encountering real-world road signs with graffiti, the AI might read them wrong, i.e., give a false-positive. The bias is especially pronounced when driving in Wallaby’s hometown of San Francisco or Oakland, because the road signs there are rarely clean or without graffiti.

Figure 38.
Figure 39.
Figure 40.
Figure 41.

If the above AI system is used in a self-driving vehicle system, then the result is a catastrophic failure. People’s lives could be lost. The car would not know to stop at the stop sign or would not read the merging-lane yield sign correctly when driving in San Francisco.

Here is an example of a stop-sign. It is a photo in the “train” data set. (Figure 38.)

The goal is to augment the data-image with graffiti in three levels, “low, medium, and high.” (Figure 39.)

Medium augmentation added graffiti. (Figure 40.)

High augmentation added graffiti. (Figure 41.)
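The three graffiti levels can be pictured with a toy sketch. The real project is private, so everything here is an assumption made for illustration: the coverage fractions, the function name, and the choice to paint isolated random pixels on a small grayscale grid, whereas real graffiti would be strokes, stickers, or blobs on a photo.

```python
import random

# Fraction of the sign's pixels painted over at each level (assumed values).
GRAFFITI_COVERAGE = {"low": 0.05, "medium": 0.15, "high": 0.30}


def add_graffiti(image, level, rng, paint=0):
    """Return a copy of a grayscale image with random pixels painted over."""
    height, width = len(image), len(image[0])
    marked = [row[:] for row in image]            # leave the original untouched
    n_paint = round(height * width * GRAFFITI_COVERAGE[level])
    cells = [(r, c) for r in range(height) for c in range(width)]
    for r, c in rng.sample(cells, n_paint):
        marked[r][c] = paint
    return marked


rng = random.Random(42)
clean_sign = [[255] * 20 for _ in range(20)]      # a blank 20x20 "sign"
painted_counts = {}
for level in ("low", "medium", "high"):
    marked = add_graffiti(clean_sign, level, rng)
    painted_counts[level] = sum(v == 0 for row in marked for v in row)
    print(level, painted_counts[level], "of 400 pixels painted")
```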

Wallaby’s friend took the AUD3 notebook, expanded it, and added a reference to the “How Hackers Could Fool Artificial Intelligence” podcast by Dina Temple-Raston on Shortwave NPR, Sep. 21, 2020.

He did it in a Jupyter Notebook and not in PowerPoint, presented it to the executive staff, and this week got the green light to start the project. :-)

If you have other successes in hacking the AUD3 notebook, please share them.

3 — Conclusion

AUD3 is a visual journey with little coding and math. It is fun to write and code. Data augmentation substantially impacts the ANN model accuracy, yet not many data scientists or AI students are using it.

Jupyter Notebook is the best tool for learning because it enables interactivity and individuality. It is a web app, and hosted versions such as Google Colab run in the cloud, so there is no need to install Python or any other app or extension, and Google Colab with a GPU is free. It is accessible through a web browser. You can access a Jupyter notebook using laptops, tablets, Android phones, or iPhones. There are times when I start the ANN model training cycles and view the result on my iPad. The notebook file, *.ipynb, is a JSON object. It’s that simple.
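Because the notebook file is JSON, Python’s standard json module can write and read a minimal one. The sketch below builds a tiny two-cell notebook in the version 4 notebook format; the cell contents are made up for illustration, and real notebooks carry more metadata.

```python
import json

# A minimal notebook in the version 4 format: one markdown and one code cell.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {"cell_type": "markdown", "metadata": {},
         "source": ["# My detour"]},
        {"cell_type": "code", "metadata": {}, "execution_count": None,
         "outputs": [], "source": ["print('hello, Wallaby')"]},
    ],
}

text = json.dumps(notebook, indent=1)     # what would be saved as a .ipynb file
loaded = json.loads(text)                 # reading it back is plain JSON parsing
code_cells = [c for c in loaded["cells"] if c["cell_type"] == "code"]
print(len(loaded["cells"]), "cells, of which", len(code_cells), "is code")
```

That is all a “hack” of the story amounts to at the file level: adding or editing entries in the "cells" list.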

Once you internalize the knowledge, concepts, and methods by doing it via coding or writing, you have a solid foundation to move forward. You may discover a new knowledge nugget or an epiphany.

As the author, I lay out the journey’s beginning, middle, and end, but through Jupyter Notebook, readers can, and are even encouraged to, hack the story by adding notes, modifying the code, or writing a new detour.

For example, after I previewed the AUD3 notebook to a friend, he took the notebook and expanded it to include the “inject known noises” image augmentation feature. Assume a scenario where the ANN must read street road signs correctly, e.g., a stop sign, a 40 MPH sign, a merge left sign, and so on. He wants to randomly add graffiti, stickers, or buckshot to the road sign pictures so that, in the real-world application, the ANN would recognize the road signs correctly.

I wish that his company would allow him to publish his Jupyter notebook. If he posted it as a whitepaper, company blog, or article, it would be like giving me a car without an engine.

Incidentally, the “inject known noises” feature touches on a salient point of data bias. Even without knowing the full scope, there are “known” biases in the road sign ANN project. For one, it is biased toward clean cities with pristine road signs, e.g., Frankfurt or Copenhagen. The system will fail in my hometown of San Francisco or Oakland, California.

There you have it. We found a scenario where data scientists could use data augmentation to lessen the data biases.

Data scientists are not the only ones who have epiphanies. As Wallaby said, the AUD3 journey is full of common sense approaches backed by data. Any gentle reader might have a new idea about data augmentation. If you do and are not comfortable with coding, contact me via LinkedIn, and maybe my colleagues or I could help you.

Figure 42.

The demystify AI series is for having fun sharing concepts and code with colleagues and AI students, but more than that, it is a peek into the daily workaday problems of AI scientists.

Not all AI programmers work on large-scale omnipotent AI at Google, Facebook, or a government Defense Department. If we were in the Star Trek universe, these would not be the captain’s or chief engineer’s logs. They are the logs of an ensign on the lower deck scrubbing Jefferies tubes.

I hope to see you on the next Demystify AI journey.


2020 was the year when fake news and misinformation became mainstream. Unfortunately, I have read too many highly polarized articles about mistrusting AI on social media and the mainstream news channels. These fears are misplaced and misinformed, and they fracture our society.

Duc Haba, 2020

Doing nothing is not the same as doing no harm. Therefore, we can’t opt out and do nothing, so I take baby steps. The notebooks are not about large-scale omnipotent AI. Instead, they demystify AI by showing the mundane problems facing AI scientists in a real-world project. It is like demystifying crab fishermen by watching the TV series “Deadliest Catch.”

“Do no harm by doing positive deeds” is the fundamental reason why I write and share the demystify AI series. The articles are for having fun sharing concepts and code with colleagues and AI students, but more than that, they are for building trust between AI scientists and social media.

I hope you enjoy reading it, sharing it, and giving it a thumbs-up.

<<the article was published in LinkedIn>>

Demystify AI Series

  1. Hot off the press. “AI Start Here (A1SH)” — on GitHub (July 2021)
  2. “Book Study Group” on LinkedIn | on GitHub (January 2021)
  3. “Deep Learning Augmentation Data Deep Dive” on GitHub (December 2020)
  4. “Demystify Neural Network NLP Input-data and Tokenizer” on LinkedIn | on GitHub (November 2020)
  5. “Python 3D Visualization” on LinkedIn | on GitHub (September 2020)
  6. “Demystify Python 2D Charts” on LinkedIn | on GitHub (September 2020)
  7. “Norwegian Blue Parrot, The “k2fa” AI” on LinkedIn | on K2fa-Website (August 2020)
  8. “The Texas Two-Step, The Hero of Digital Chaos” on LinkedIn (February 2020)
  9. “Be Nice 2020” on Website (January 2020)



Duc Haba

AI Solution Architect at Prior Xerox PARC, Oracle, GreenTomato, Viant, CEO Swiftbot, CTO