3D Image Reconstruction from a 2D Image

A Python prototype that converts 2D photos or text prompts into 3D models (.obj) using depth estimation and surface reconstruction.

1. Steps to Run

  1. Clone the repository and navigate to the project root.

  2. Ensure your folders are structured like:

    ├── CODE/
    │   └── main.py
    ├── DATA/
    │   └── toy.jpg           # or any input image
    └── RESULT/
        ├── Toy3D.obj         # output mesh
        └── 3D_mesh_plot.png  # output visualization PNG

  3. Create and activate your environment, then install dependencies:

    pip install -r requirements.txt

  4. Run the script (e.g., from Spyder or the command line):

    python CODE/main.py

2. Libraries Used

  1. torch – For inference with the depth estimation model.

  2. transformers – To load the GLPN model (vinvino02/glpn-nyu).

  3. Pillow – For image loading and resizing.

  4. matplotlib – For visualization.

  5. numpy – For processing image and depth data.

  6. open3d – For creating 3D point clouds and meshes.

  7. pyplot with the TkAgg backend (part of matplotlib) – used with the Spyder IDE for inline plotting.

  8. rembg – For automatic background removal from images.

  9. onnxruntime – For running inference with ONNX models (required by rembg).

3. Thought Process

1. Input Image Preprocessing

  1. Loaded the input image and removed the background using rembg to isolate the foreground object.

  2. Resized the image and mask to fit the GLPN model’s input requirements while preserving aspect ratio.
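The two preprocessing steps above can be sketched as follows. This is a minimal illustration, not the repo's exact code: the helper names are ours, `rembg.remove` returns an RGBA cutout whose alpha channel serves as the foreground mask, and the multiple-of-32 sizing reflects GLPN's documented size divisor.

```python
import numpy as np
from PIL import Image

def load_and_mask(path):
    """Remove the background and return the RGBA cutout plus a boolean mask."""
    from rembg import remove  # imported lazily: rembg pulls in onnxruntime
    img = Image.open(path).convert("RGB")
    cutout = remove(img)                 # RGBA; alpha channel = foreground
    mask = np.array(cutout)[:, :, 3] > 0
    return cutout, mask

def resize_for_glpn(img, max_side=640):
    """Shrink while preserving aspect ratio, then snap each side down to a
    multiple of 32 (GLPN's size divisor)."""
    w, h = img.size
    scale = min(max_side / max(w, h), 1.0)
    new_w = max(32, int(w * scale) // 32 * 32)
    new_h = max(32, int(h * scale) // 32 * 32)
    return img.resize((new_w, new_h), Image.LANCZOS)
```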

2. Depth Estimation

  1. Used the pretrained GLPN model to generate a depth map from the masked image.

  2. Post-processed the depth map by cropping the padding and applying the foreground mask to exclude background pixels.
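The depth-estimation step maps to a few `transformers` calls. A minimal sketch, assuming the `vinvino02/glpn-nyu` checkpoint named above; the `apply_mask` helper is ours:

```python
import numpy as np
from PIL import Image

def estimate_depth(image: Image.Image) -> np.ndarray:
    """Run GLPN monocular depth estimation; returns an (H, W) depth map."""
    # imported lazily: torch and transformers are heavy dependencies
    import torch
    from transformers import GLPNImageProcessor, GLPNForDepthEstimation

    processor = GLPNImageProcessor.from_pretrained("vinvino02/glpn-nyu")
    model = GLPNForDepthEstimation.from_pretrained("vinvino02/glpn-nyu")
    inputs = processor(images=image, return_tensors="pt")
    with torch.no_grad():
        depth = model(**inputs).predicted_depth  # shape (1, H, W)
    return depth.squeeze().cpu().numpy()

def apply_mask(depth: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Zero out background pixels so they drop out of the point cloud."""
    return np.where(mask, depth, 0.0)
```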

3. 3D Reconstruction

  1. Converted the masked image and depth map into an Open3D RGBD image.

  2. Created a point cloud from the RGBD data and cleaned it using statistical outlier removal to eliminate noise.
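A sketch of the reconstruction step. `backproject` is a pure-NumPy illustration of the pinhole back-projection Open3D performs internally; the intrinsics (Open3D's PrimeSense default profile) and the outlier-removal parameters are assumptions, not necessarily the repo's exact values.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Pinhole back-projection: (H, W) depth map -> (N, 3) points,
    skipping zero-depth (masked background) pixels."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    valid = depth > 0
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x[valid], y[valid], depth[valid]], axis=1)

def rgbd_to_pointcloud(color, depth):
    import open3d as o3d  # imported lazily: open3d is a heavy dependency
    rgbd = o3d.geometry.RGBDImage.create_from_color_and_depth(
        o3d.geometry.Image(np.ascontiguousarray(color)),
        o3d.geometry.Image(depth.astype(np.float32)),
        convert_rgb_to_intensity=False)
    intrinsic = o3d.camera.PinholeCameraIntrinsic(
        o3d.camera.PinholeCameraIntrinsicParameters.PrimeSenseDefault)
    pcd = o3d.geometry.PointCloud.create_from_rgbd_image(rgbd, intrinsic)
    # statistical outlier removal, as described above
    pcd, _ = pcd.remove_statistical_outlier(nb_neighbors=20, std_ratio=2.0)
    return pcd
```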

4. Mesh Generation

  1. Estimated surface normals and reconstructed a 3D mesh using Poisson surface reconstruction.

  2. Rotated the mesh for better viewing alignment and exported it as a .obj file for external visualization.
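The mesh step can be sketched with Open3D's Poisson API. The Poisson `depth=9`, the normal-estimation radius, and the 180° rotation are illustrative guesses, and `rotation_x` is our own helper:

```python
import numpy as np

def rotation_x(theta: float) -> np.ndarray:
    """3x3 rotation matrix about the x axis (right-handed)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

def pointcloud_to_mesh(pcd, out_path="RESULT/Toy3D.obj"):
    import open3d as o3d  # imported lazily: open3d is a heavy dependency

    # normals are required by Poisson surface reconstruction
    pcd.estimate_normals(
        search_param=o3d.geometry.KDTreeSearchParamHybrid(radius=0.1, max_nn=30))
    mesh, _ = o3d.geometry.TriangleMesh.create_from_point_cloud_poisson(
        pcd, depth=9)
    # flip the mesh upright for most viewers, then export as .obj
    mesh.rotate(rotation_x(np.pi), center=(0.0, 0.0, 0.0))
    o3d.io.write_triangle_mesh(out_path, mesh)
    return mesh
```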

5. Visualization

  1. Generated an interactive 3D plot of the mesh using Matplotlib for quick validation.
  2. Due to hardware limitations, I could not display the interactive plot directly on my machine, so I exported the 3D visualization as a PNG image; it is included in the repository and shown below.
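The PNG export can be reproduced headlessly with matplotlib's Agg backend, which renders straight to a file and needs no display. A sketch assuming mesh vertices and triangles as NumPy arrays; the function and file names are illustrative:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend: render directly to a file
import matplotlib.pyplot as plt
import numpy as np

def plot_mesh(vertices, triangles, out_path="RESULT/3D_mesh_plot.png"):
    """Save a static 3D rendering of the mesh as a PNG."""
    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")
    ax.plot_trisurf(vertices[:, 0], vertices[:, 1], vertices[:, 2],
                    triangles=triangles, cmap="viridis")
    ax.set_title("Poisson-reconstructed mesh")
    fig.savefig(out_path, dpi=150)
    plt.close(fig)
```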

4. Input and Output

  • Input image

  [image]

  • Output 3D model

  2025-05-06.22-14-37.mov

  • Visualization

  [image]

Note: The 3D mesh visualization was exported as a PNG file using Matplotlib, as my system could not render the interactive plot in real time. This approach ensures that the output remains accessible for review regardless of hardware limitations.
