Introduction
Today we made progress on our Pokémon card recognition model. The goal is an application that accurately identifies Pokémon cards from images, trained on a dataset of 17,000 high-quality images. This post walks through what we did, the problems we hit, and the fixes we applied along the way.
Setting Up and Running the Model
Initial Setup and Issues
We began by running our data preparation script, data_prep_new.py, and encountered a few challenges related to missing libraries and incorrect paths. After resolving these issues by installing the necessary libraries (e.g., Pillow for image processing) and adjusting the file paths, we successfully prepared our dataset.
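A small pre-flight check can surface both kinds of problem before the preparation script runs. The sketch below is only an illustration: the helper names, the module list, and the data/raw directory are assumptions and are not taken from data_prep_new.py.

```python
import importlib.util
from pathlib import Path


def missing_modules(module_names):
    """Return the top-level module names that cannot be imported."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


def missing_paths(paths):
    """Return the paths (as strings) that do not exist on disk."""
    return [str(p) for p in map(Path, paths) if not p.exists()]


if __name__ == "__main__":
    # "PIL" is the import name of the Pillow package; the directory is hypothetical.
    for name in missing_modules(["PIL", "numpy", "sklearn"]):
        print(f"Missing library: {name}")
    for path in missing_paths(["data/raw"]):
        print(f"Missing path: {path}")
```

Running a check like this first turns a mid-run crash into a short list of things to install or fix.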
Training the Model
Next, we focused on training our model. We encountered the following issues and implemented solutions:
- Memory Allocation for GPU: Ensured TensorFlow was configured to manage GPU memory growth.
- Mixed Precision: Enabled mixed precision to optimize performance on supported hardware.
- Training with Early Stopping: Implemented early stopping to limit overfitting. Training stops once validation performance stops improving, and the best-performing model seen during training is saved.
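The three items above map onto standard TensorFlow 2.x calls. The sketch below shows one plausible way to wire them up; it is not the actual training script, and the function names, the best_model.h5 path, and the patience value are assumptions made for illustration.

```python
def configure_tensorflow():
    """Enable GPU memory growth and mixed precision; call before building the model."""
    import tensorflow as tf  # imported here so this file loads without TensorFlow

    # Allocate GPU memory on demand instead of reserving it all at startup.
    for gpu in tf.config.list_physical_devices("GPU"):
        tf.config.experimental.set_memory_growth(gpu, True)

    # Compute in float16 where possible while keeping variables in float32.
    tf.keras.mixed_precision.set_global_policy("mixed_float16")
    return tf


def build_callbacks(tf, checkpoint_path="best_model.h5", patience=3):
    """Return callbacks that stop training early and keep the best model."""
    # Stop when val_loss has not improved for `patience` epochs (assumed value).
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=patience, restore_best_weights=True
    )
    # Write the model to disk only when val_loss improves (hypothetical path).
    checkpoint = tf.keras.callbacks.ModelCheckpoint(
        checkpoint_path, monitor="val_loss", save_best_only=True
    )
    return [early_stop, checkpoint]
```

A training script would call tf = configure_tensorflow() once at startup and pass build_callbacks(tf) to model.fit(..., callbacks=...). When mixed precision is enabled, the final softmax layer is usually kept in float32 (for example with dtype="float32") for numerical stability.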
The modified training script included logging functionality to capture the training progress and feedback in text files for easier analysis.
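As a rough idea of what that logging could look like, here is a minimal sketch using Python's standard logging module. The function names and the line format are assumptions; the real script may log different fields.

```python
import logging


def make_training_logger(log_path):
    """Return a logger that appends timestamped lines to log_path."""
    logger = logging.getLogger(f"training:{log_path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these lines out of the root logger
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(logging.Formatter("%(asctime)s %(message)s"))
        logger.addHandler(handler)
    return logger


def log_epoch(logger, epoch, metrics):
    """Write one line per epoch, e.g. 'epoch=1 loss=9.8421 val_loss=9.8977'."""
    fields = " ".join(f"{name}={value:.4f}" for name, value in metrics.items())
    logger.info("epoch=%d %s", epoch, fields)
```

In a Keras training loop, log_epoch could be called from a callback's on_epoch_end hook, which receives the epoch index and a dictionary of metrics.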
Evaluation and Feedback
For evaluation, we used a separate script to test our model’s performance on a validation set. We added functionality to export evaluation results and feedback into text files, ensuring we could review detailed performance metrics.
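The evaluation step boils down to comparing predicted labels with true labels and writing the result somewhere reviewable. The sketch below shows that shape in plain Python; the function names and the report format are hypothetical, not the evaluation script's own.

```python
from pathlib import Path


def accuracy(predicted_labels, true_labels):
    """Fraction of predictions that match the true labels."""
    if len(predicted_labels) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    correct = sum(p == t for p, t in zip(predicted_labels, true_labels))
    return correct / len(true_labels)


def write_report(path, metrics):
    """Write one 'name: value' line per metric and return the text written."""
    text = "".join(f"{name}: {value}\n" for name, value in metrics.items())
    Path(path).write_text(text, encoding="utf-8")
    return text
```

Keeping the report as plain "name: value" lines makes it easy to compare runs side by side.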
Environmental Setup
Our environment setup was crucial for the successful implementation of the model. Below are the key components:
- Hardware: We used an iBUYPOWER TraceMesh 7 Gaming Desktop with a 14th Gen Intel Core i7-14700F processor and a GeForce RTX 4060 GPU.
- Software: The environment was set up on Ubuntu Server 22.04 with Python 3.8, TensorFlow 2.x, and supporting libraries such as numpy, Pillow, and scikit-learn.
- Data Storage: Dataset and logs were stored locally on the machine, with paths adjusted in scripts to ensure seamless data access.
Detailed Analysis of Training and Evaluation
Training Output
Epoch 1/10
881/881 [==============================] - 16s 16ms/step - loss: 9.8421 - accuracy: 7.0995e-05 - val_loss: 9.8977 - val_accuracy: 0.0000e+00
Epoch 2/10
881/881 [==============================] - 14s 16ms/step - loss: 9.2888 - accuracy: 0.0014 - val_loss: 9.9966 - val_accuracy: 0.0000e+00
...
Epoch 10/10
881/881 [==============================] - 14s 16ms/step - loss: 0.1235 - accuracy: 0.9705 - val_loss: 12.0594 - val_accuracy: 0.0929
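Reading across the epochs, training loss falls from 9.84 to 0.12 while validation loss rises from 9.90 to 12.06, a pattern that usually indicates overfitting. The sketch below pulls the metrics out of the first and last log lines to make the gap explicit; the regular expression and the function name are our own and assume the line format shown above.

```python
import re

# Matches the metric portion of a Keras epoch summary line.
EPOCH_LINE = re.compile(
    r"loss: (?P<loss>[\d.e+-]+) - accuracy: (?P<accuracy>[\d.e+-]+)"
    r" - val_loss: (?P<val_loss>[\d.e+-]+) - val_accuracy: (?P<val_accuracy>[\d.e+-]+)"
)


def parse_epoch_line(line):
    """Extract the four metrics from one epoch summary line as floats."""
    match = EPOCH_LINE.search(line)
    if match is None:
        raise ValueError(f"not an epoch summary line: {line!r}")
    return {name: float(value) for name, value in match.groupdict().items()}


first = parse_epoch_line(
    "881/881 [==============================] - 16s 16ms/step - loss: 9.8421"
    " - accuracy: 7.0995e-05 - val_loss: 9.8977 - val_accuracy: 0.0000e+00"
)
last = parse_epoch_line(
    "881/881 [==============================] - 14s 16ms/step - loss: 0.1235"
    " - accuracy: 0.9705 - val_loss: 12.0594 - val_accuracy: 0.0929"
)

# Gap between training and validation accuracy at the final epoch.
accuracy_gap = last["accuracy"] - last["val_accuracy"]
# How much validation loss moved between the first and last epoch.
val_loss_change = last["val_loss"] - first["val_loss"]

print(f"accuracy gap: {accuracy_gap:.4f}")
print(f"val_loss change: {val_loss_change:+.4f}")
```

The accuracy gap of about 0.88 and the rising validation loss are the numbers to watch as the training setup changes.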
Evaluation Output
Test loss: 12.059427261352539
Test accuracy: 0.09285815805196762
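These figures match the final epoch's val_loss and val_accuracy, which is consistent with evaluating on the same validation set. As a rough sanity check, the sketch below compares the test loss with the loss of uniform guessing. It assumes the loss is categorical cross-entropy, which this post does not state, and it treats the first-epoch training loss as a stand-in for a near-uniform model.

```python
import math

first_epoch_loss = 9.8421        # training loss after epoch 1, from the training output
test_loss = 12.059427261352539   # from the evaluation output

# Under cross-entropy (an assumption), a model that spreads probability evenly
# over C classes has loss ln(C), so C is roughly exp(loss).
implied_classes = math.exp(first_epoch_loss)

# exp(-loss) is the geometric-mean probability assigned to the correct class.
mean_true_class_prob = math.exp(-test_loss)
uniform_prob = 1.0 / implied_classes

print(f"implied number of classes: about {implied_classes:,.0f}")
print(f"probability on the correct class: {mean_true_class_prob:.2e}")
print(f"probability under uniform guessing: {uniform_prob:.2e}")
```

Under these assumptions the implied class count is on the same order as the 17,000 images, which would mean very few images per card. On average the model also puts less probability on the correct card than uniform guessing would. Both points are inferences from the loss values, not measured facts.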
Feedback Logging
The model’s feedback mechanism identified low-confidence predictions and logged these for review. This will help us fine-tune the model by adjusting the training data based on user feedback.
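A minimal sketch of that kind of filter is shown below. The 0.5 threshold, the function name, and the card names in the example are assumptions chosen for illustration; the post does not specify how the feedback mechanism decides what counts as low confidence.

```python
def low_confidence(predictions, threshold=0.5):
    """Return (index, label, confidence) for predictions below threshold.

    predictions is a list of dicts mapping class label -> probability.
    """
    flagged = []
    for index, probs in enumerate(predictions):
        # The predicted label is the one with the highest probability.
        label, confidence = max(probs.items(), key=lambda item: item[1])
        if confidence < threshold:
            flagged.append((index, label, confidence))
    return flagged


# Hypothetical model outputs for two images.
example = [
    {"Pikachu": 0.91, "Raichu": 0.09},
    {"Charizard": 0.40, "Charmeleon": 0.35, "Charmander": 0.25},
]
flagged = low_confidence(example)
print(flagged)  # only the second image falls below the threshold
```

Each flagged entry can then be written to the feedback file alongside the image path so a person can confirm or correct the label.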
Conclusion and Next Steps
Viability of the Use Case
With our dataset of 17,000 high-quality images, the use case looks viable, but the current model is not there yet: it reaches about 97% training accuracy and only about 9% validation accuracy, which points to overfitting. For real-world use, we are aiming for at least 95% accuracy on images the model has not seen. Getting there will take continued refinement, including:
- Augmenting the dataset with more images.
- Fine-tuning the model based on user feedback.
- Experimenting with different architectures and hyperparameters.
Future Work
- Data Augmentation: Increase the diversity of the training dataset to improve model generalization.
- Model Tuning: Explore different neural network architectures and hyperparameters.
- User Feedback Integration: Implement a robust system for capturing and integrating user feedback into the training process.
- Deployment: Prepare the model for deployment in a user-friendly application.
By following these steps, we aim to create a highly accurate Pokémon card recognition system that meets user expectations and provides a seamless experience.
Stay tuned for more updates as we continue to refine and enhance our Pokémon card recognition model! In our next post, we will upload the changes to GitHub and provide a detailed review of each script.
Redfred is a systems engineer based in Seattle, WA, with a passion for technology, coding, and learning Japanese. When not working on tech projects, Redfred enjoys riding e-bikes and collecting vinyl records. Follow along for more insights into tech projects and development tips! And remember, all of this was created by ChatGPT-4o…even this post. 😉
