Cat or Dog? Check to see if my Fast.ai neural network can recognise whether your picture is a Cat or Dog
Integrating the Fast.ai app from Lesson 2 with this Hugo site
Test your Cat or Dog image above
Go ahead and try uploading cat and dog images. You can also try any other image and see what the classifier thinks of it!
Creating this post was a two-step process:
- Create an HTML page with numerous prompts to ChatGPT
- Integrate the HTML snippet into a Hugo post with the help of this article.
I asked ChatGPT to generate a description of the HTML page for the remainder of the post.
Requirements of the HTML Page
The web page serves as a user-friendly interface for cat and dog image classification, with these key requirements:
- Image Upload Capability: Users can select multiple images through a file input element.
- Image Preview: Selected images appear in a preview panel at a reasonable size (300×300 pixels).
- Image Gallery: Thumbnails of all uploaded images are displayed in a horizontally scrollable container.
- Classification Results: After processing, the page displays both the primary classification result (“It’s a cat!” or “It’s a dog!”) and the confidence scores for both categories.
- Interactive Selection: Users can click on any thumbnail to make it the active image for classification.
All of these features are implemented without requiring a page reload, creating a smooth, application-like experience right in the browser.
HTML Layout Understanding
The HTML structure is organized in a clean, functional way with several key sections:
- Style Section: Contains CSS rules for the layout, including:
  - Flex-based side-by-side containers for the prediction results and image preview
  - Styling for the image thumbnails and the scrollable container
  - Visual indicators for the selected image (border highlighting)
  - Responsive design elements to ensure proper display across devices
- Input Element: A simple file input control that accepts image files with the `multiple` attribute enabled to allow batch uploads.
- Content Containers:
  - The `side-by-side` div creates a two-column layout containing:
    - `resultContainer` for displaying classification results
    - `previewContainer` for showing the selected image at a larger size
  - `imageContainer` provides a horizontally scrollable gallery of all uploaded images
The layout keeps the main preview image, the classification results, and the thumbnail gallery clearly separated from one another.
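To make the structure described above more concrete, here is a minimal sketch of the markup and styles, expressed as JavaScript template strings that could be injected into a page. The container names come from the description; whether they are ids or classes, the `imageInput` id, the `selected` class, and every CSS value other than the 300×300 preview size are assumptions made for illustration, not the actual page source.

```javascript
// Hypothetical skeleton: container names are from the post, but the use of
// ids vs. classes, the "imageInput" id, and the attributes are assumptions.
const pageSkeleton = `
<input type="file" id="imageInput" accept="image/*" multiple>
<div class="side-by-side">
  <div id="resultContainer"></div>
  <div id="previewContainer"></div>
</div>
<div id="imageContainer"></div>
`;

// Hypothetical styles matching the described layout: a flex row for the
// two columns, a horizontally scrollable gallery, a border on the
// selected thumbnail, and a preview capped at 300x300 pixels.
const pageStyles = `
.side-by-side { display: flex; gap: 1rem; flex-wrap: wrap; }
#previewContainer img { max-width: 300px; max-height: 300px; }
#imageContainer { display: flex; overflow-x: auto; }
#imageContainer img { height: 80px; margin-right: 8px; cursor: pointer; }
#imageContainer img.selected { border: 2px solid #0078d4; }
`;
```

In a Hugo post the same markup would more likely be written directly as HTML inside the page; the strings are only a compact way to show the pieces side by side.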
JavaScript Rationale
The JavaScript code handles the application’s dynamic behavior with a thoughtful approach to user experience:
- Gradio Client Integration:
  `import { Client } from "https://cdn.jsdelivr.net/npm/@gradio/client@1.6.0-beta.3/dist/index.min.js";`
  - The page connects to a Gradio backend (specifically “atomglitch/fastailesson2”) which hosts the trained fastai model
  - This allows us to leverage a pre-trained machine learning model without needing complex backend infrastructure
- Event-Driven Handling:
  - The code uses event listeners to respond to user actions like file selection and thumbnail clicks
  - This creates a responsive, app-like feeling in a simple webpage
- Duplicate Prevention:
  - A `Set` data structure (`selectedFiles`) tracks unique file data URLs
  - This prevents the same image from being added to the gallery multiple times
- Preview Management:
  - The `setPreviewImage()` function handles both updating the preview and triggering the classification
  - Visual feedback includes highlighting the selected thumbnail and displaying “Processing…” during classification
- Classification Process:
  - When a new image is selected, the code:
    - Converts the image to a blob format
    - Sends it to the Gradio backend for prediction
    - Parses and displays the results, showing both the main prediction and confidence scores
    - Handles errors gracefully with console logging
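The duplicate-prevention and selection ideas from the list above can be sketched without any DOM code. This is a minimal illustration, assuming images are identified by their data URLs as described; only the `selectedFiles` name comes from the page, while `createGallery` and its methods are hypothetical helpers.

```javascript
// Hypothetical helper: only the `selectedFiles` Set is named in the post;
// the surrounding object and method names are illustrative.
function createGallery() {
  const selectedFiles = new Set(); // unique data URLs of uploaded images
  let activeUrl = null;            // data URL currently shown in the preview

  return {
    // Adds a data URL; returns true only if it was not already in the gallery.
    add(dataUrl) {
      if (selectedFiles.has(dataUrl)) return false;
      selectedFiles.add(dataUrl);
      return true;
    },
    // Marks an already-added image as active; returns false for unknown URLs.
    select(dataUrl) {
      if (!selectedFiles.has(dataUrl)) return false;
      activeUrl = dataUrl;
      return true;
    },
    get active() { return activeUrl; },
    get size() { return selectedFiles.size; },
  };
}

const gallery = createGallery();
gallery.add("data:image/png;base64,AAAA"); // true: first time this image is seen
gallery.add("data:image/png;base64,AAAA"); // false: duplicate is ignored
```

In the real page, a `true` result from `add` would be the point at which a thumbnail is appended to the gallery, and `select` would correspond to a thumbnail click that updates the preview.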
Overall, the page stays simple while covering the features listed above. It shows how a hosted machine learning model can be integrated into a web page, including this blog, with relatively little code, giving end users an intuitive interface to an image classification model.
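As a rough illustration of how little code the classification step needs, the sketch below converts a data URL to a blob, sends it to the Space, and formats the result. It is written under several assumptions that are not confirmed by the page: the endpoint name `/predict`, the `img` parameter name, and a Label-style output of the form `{ label, confidences: [{ label, confidence }] }`. The function names `formatPrediction` and `classifyImage` are hypothetical.

```javascript
// In the browser, the client would come from the import quoted in the post:
//   import { Client } from "https://cdn.jsdelivr.net/npm/@gradio/client@1.6.0-beta.3/dist/index.min.js";
//   const client = await Client.connect("atomglitch/fastailesson2");

// Turns a Label-style output into display strings.
// Assumed shape: { label: "cat", confidences: [{ label, confidence }, ...] }
function formatPrediction(output) {
  const headline = `It's a ${output.label.toLowerCase()}!`;
  const scores = (output.confidences ?? []).map(
    ({ label, confidence }) => `${label}: ${(confidence * 100).toFixed(1)}%`
  );
  return { headline, scores };
}

// Sends one image to the Space and returns the formatted result, or null
// on failure. "/predict" and the `img` parameter name are assumptions
// about the Space's API, not values taken from the page.
async function classifyImage(client, dataUrl) {
  try {
    const blob = await (await fetch(dataUrl)).blob(); // data URL -> Blob
    const result = await client.predict("/predict", { img: blob });
    return formatPrediction(result.data[0]);
  } catch (err) {
    console.error("Classification failed:", err); // errors are only logged
    return null;
  }
}
```

Passing the client in as an argument keeps the sketch independent of how the connection is created; on the page, the returned headline and scores would be written into the results container once the "Processing…" message is cleared.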