A Python prototype that converts 2D photos or text prompts into 3D models (.obj) using depth estimation and surface reconstruction.
- Clone the repository and navigate to the project root.
- Ensure your folders are structured like:

      ├── CODE/
      │   └── main.py
      ├── DATA/
      │   └── toy.jpg             # or any input image
      └── RESULT/
          ├── Toy3D.obj           # output mesh
          └── 3D_mesh_plot.png    # output visualization (PNG image)

- Create and activate your environment, then install dependencies:

      pip install -r requirements.txt

- Run the script (e.g., from Spyder or the command line):

      python CODE/main.py
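
Before running the script, the folder layout described above can be checked with a few lines of Python. This is a minimal sketch, not part of the repository: the helper name `prepare_project` is hypothetical, and creating `RESULT/` in advance is an assumption, since `main.py` may already create it.

```python
from pathlib import Path

# Paths the setup steps expect to exist, relative to the project root.
EXPECTED_INPUTS = ("CODE/main.py", "DATA")


def prepare_project(root="."):
    """Return the expected paths that are missing and make sure RESULT/ exists."""
    root = Path(root)
    missing = [p for p in EXPECTED_INPUTS if not (root / p).exists()]
    # Assumption: the script writes its outputs into RESULT/, so create it up front.
    (root / "RESULT").mkdir(exist_ok=True)
    return missing
```

Calling `prepare_project()` from the project root returns an empty list when the layout matches.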
- torch – For inference with the depth estimation model.
- transformers – To load the GLPN model (vinvino02/glpn-nyu).
- Pillow – For image loading and resizing.
- matplotlib – For visualization.
- numpy – For processing image and depth data.
- open3d – For creating 3D point clouds and meshes.
- matplotlib.pyplot (TkAgg backend) – Used with the Spyder IDE to show plots in a separate interactive window rather than inline.
- rembg – For automatic background removal from images.
- onnxruntime – For running inference with ONNX models (required by rembg).
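
To confirm that the environment has everything listed above, a small check along these lines may help. It is a sketch rather than part of the project; the helper name `missing_modules` is hypothetical, and the list uses import names (Pillow is imported as `PIL`).

```python
import importlib.util

# Import names for the packages listed above (Pillow installs the "PIL" module).
REQUIRED_MODULES = [
    "torch", "transformers", "PIL", "matplotlib",
    "numpy", "open3d", "rembg", "onnxruntime",
]


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]
```

An empty result from `missing_modules()` suggests `pip install -r requirements.txt` completed successfully.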
- Loaded the input image and removed the background using rembg to isolate the foreground object.
- Resized the image and mask to fit the GLPN model’s input requirements while preserving aspect ratio.
- Used the pretrained GLPN model to generate a depth map from the masked image.
- Post-processed the depth map by cropping the padding, then applied the foreground mask to exclude background pixels.
- Converted the masked image and depth map into an Open3D RGBD image.
- Created a point cloud from the RGBD data and cleaned it using statistical outlier removal to eliminate noise.
- Estimated surface normals and reconstructed a 3D mesh using Poisson surface reconstruction.
- Rotated the mesh for better viewing alignment and exported it as an .obj file for external visualization.
- Generated an interactive 3D plot of the mesh using Matplotlib for quick validation.
- Due to hardware limitations, I was unable to display the interactive plot directly on my machine. To ensure the results are still accessible and reviewable, I exported the 3D visualization as a PNG image file, which is included in the repository and showcased below.
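
The geometric core of the steps above can be illustrated in plain NumPy. This is a simplified sketch, not the code in `main.py`, which relies on Open3D for these operations. The function names, the target height of 480, the focal length of 500, and the brute-force neighbour search are all assumptions made for illustration; the multiple of 32 reflects the default size divisor of the GLPN image processor in transformers.

```python
import numpy as np


def fit_to_multiple(width, height, target_height=480, multiple=32):
    """Scale a size to about target_height, keeping the aspect ratio,
    and round both sides down to a multiple of `multiple`."""
    scale = target_height / height
    new_h = max(multiple, round(height * scale) // multiple * multiple)
    new_w = max(multiple, round(width * scale) // multiple * multiple)
    return new_w, new_h


def depth_to_points(depth, mask, focal_length=500.0):
    """Back-project foreground depth pixels to 3D with a pinhole camera model.
    The principal point is assumed to be the image centre."""
    h, w = depth.shape
    cx, cy = w / 2.0, h / 2.0
    # Row (v) and column (u) indices of valid foreground pixels.
    v, u = np.nonzero(mask.astype(bool) & (depth > 0))
    z = depth[v, u].astype(np.float64)
    x = (u - cx) * z / focal_length
    y = (v - cy) * z / focal_length
    return np.column_stack([x, y, z])


def remove_statistical_outliers(points, nb_neighbors=20, std_ratio=2.0):
    """Drop points whose mean distance to their nearest neighbours exceeds
    the global mean by more than std_ratio standard deviations.
    Brute force (O(n^2) memory), so only suitable for small clouds."""
    if len(points) < 2:
        return points
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    k = min(nb_neighbors, len(points) - 1)
    # After sorting, column 0 is each point's zero distance to itself.
    mean_knn = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    keep = mean_knn <= mean_knn.mean() + std_ratio * mean_knn.std()
    return points[keep]
```

In the actual pipeline, the resulting points would go on to normal estimation and Poisson reconstruction in Open3D; this sketch stops at the cleaned point cloud.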
- Input image
- Output 3D image
2025-05-06.22-14-37.mov
- Visualization
Note: The 3D mesh visualization was exported as a PNG file using Matplotlib, as my system could not render the interactive plot in real time. This approach ensures that the output remains accessible for review regardless of hardware limitations.
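
One way to produce such a PNG without opening a window is Matplotlib's non-interactive Agg backend. The sketch below is an assumption about how this could be done, not the export code used in the project; the helper name `save_mesh_png` and the plot styling are hypothetical.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend: renders to files, needs no display

import matplotlib.pyplot as plt
import numpy as np


def save_mesh_png(vertices, triangles, path="3D_mesh_plot.png"):
    """Render a triangle mesh to a PNG file and return the path.

    vertices:  (N, 3) float array of x, y, z coordinates
    triangles: (M, 3) int array of vertex indices
    """
    fig = plt.figure(figsize=(6, 6))
    ax = fig.add_subplot(projection="3d")
    ax.plot_trisurf(
        vertices[:, 0], vertices[:, 1], vertices[:, 2],
        triangles=triangles, linewidth=0.2, edgecolor="gray",
    )
    ax.set_xlabel("X")
    ax.set_ylabel("Y")
    ax.set_zlabel("Z")
    fig.savefig(path, dpi=150)
    plt.close(fig)  # release the figure's memory
    return path
```

With an Open3D mesh, the two arrays can be obtained via `np.asarray(mesh.vertices)` and `np.asarray(mesh.triangles)`.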