6.3 Datasets

Notes on Verifying Dataset Roles

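This file is a keyword-evidence extraction from MinerU markdown output; it is not the list of datasets the papers actually used. Treat each occurrence count as a mention_count. Every mention still has to be assigned one of the roles train / test/eval / baseline_pretrained_source / reference_only / mentioned_only.

General-purpose datasets such as COCO, ImageNet, and LAION are especially likely to appear only as the source of a base model or as a bibliographic mention, so they require manual review by default.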
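
The triage rule above can be sketched in a few lines of Python. This is only an illustration: the `Role` enum, the `GENERIC_DATASETS` set, and `needs_manual_review` are hypothetical names, and the set lists only the three datasets named in the text, which gives them as examples rather than a complete list.

```python
from enum import Enum
from typing import Optional


class Role(str, Enum):
    """The five roles a mention can be assigned (values as listed in the text)."""
    TRAIN = "train"
    TEST_EVAL = "test/eval"
    BASELINE_PRETRAINED_SOURCE = "baseline_pretrained_source"
    REFERENCE_ONLY = "reference_only"
    MENTIONED_ONLY = "mentioned_only"


# Assumption: only the three generic datasets named in the text; extend as needed.
GENERIC_DATASETS = frozenset({"COCO", "ImageNet", "LAION"})


def needs_manual_review(dataset: str, role: Optional[Role]) -> bool:
    """Return True when a mention should be routed to a human reviewer.

    Assumed policy: a mention with no role yet always needs review, and a
    generic dataset needs review regardless of the role assigned to it.
    """
    return role is None or dataset in GENERIC_DATASETS
```

For example, `needs_manual_review("LAION", Role.BASELINE_PRETRAINED_SOURCE)` is true because LAION is a generic dataset, while `needs_manual_review("SUN360", Role.TRAIN)` is false.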

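Draft Official Dataset Dictionary

official_url=needs_official_url means the official link and license still have to be filled in. For now, this table is used to unify each dataset's canonical name, aliases, modality, and scene domain.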

| dataset | aliases | dataset_type | scene_domain | panorama_native | common_usage | common_metrics | description | official_url | license |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| COCO | COCO | generic_reference_or_pretrain | generic images | no_or_needs_check | pretrain_or_reference_source | not_sota_eval_by_default | generic image dataset; usually reference/pretraining unless explicitly evaluated | needs_official_url | needs_manual_check |
| Gibson | Gibson | real_3d_scan_render_source | indoor scenes | no_or_needs_check | rendered_panorama_or_view_synthesis_source | FID/CLIPScore/LPIPS/depth_or_geometry_metrics_protocol_dependent | 3D scenes commonly rendered for embodied/indoor view synthesis tasks | needs_official_url | needs_manual_check |
| HDRI-Skies | HDRI-Skies | hdr_environment_maps | HDR environment maps | no_or_needs_check | lighting_or_hdr_environment_source | FID/LPIPS/visual_quality_protocol_dependent | HDRI/environment map source; verify exact subset and license | needs_official_url | needs_manual_check |
| HM3D | HM3D | real_3d_scan_render_source | indoor scenes | no_or_needs_check | rendered_panorama_or_view_synthesis_source | FID/CLIPScore/LPIPS/depth_or_geometry_metrics_protocol_dependent | Habitat Matterport 3D scenes; often rendered into panorama/cubemap data | needs_official_url | needs_manual_check |
| Humus | Humus | hdr_environment_maps | HDR environment maps | no_or_needs_check | lighting_or_hdr_environment_source | FID/LPIPS/visual_quality_protocol_dependent | HDRI/environment map source; verify exact subset and license | needs_official_url | needs_manual_check |
| ImageNet | ImageNet | generic_reference_or_pretrain | generic images | no_or_needs_check | pretrain_or_reference_source | not_sota_eval_by_default | generic classification/image dataset; usually feature/pretraining reference | needs_official_url | needs_manual_check |
| LAION | LAION | web_scale_pretrain_source | web image-text | no_or_needs_check | pretrain_or_reference_source | not_sota_eval_by_default | pretraining source for foundation models, not normally a panorama evaluation dataset | needs_official_url | needs_manual_check |
| LAVAL Indoor | LAVAL;LAVAL Indoor;Laval Indoor | real_hdr_indoor_panorama | indoor HDR panoramas | yes | panorama_generation_train_or_eval | FID/LPIPS/visual_quality_protocol_dependent | indoor panorama/HDR environment map dataset; often appears in NFoV-to-panorama evaluation | needs_official_url | needs_manual_check |
| Matterport3D | Matterport3D | real_indoor_rgbd_panorama | indoor scenes | yes | panorama_generation_train_or_eval | FID/CLIPScore/LPIPS/depth_or_geometry_metrics_protocol_dependent | real RGB-D scans; commonly used for indoor panorama generation/evaluation | needs_official_url | needs_manual_check |
| Pano360 | Pano360 | mixed_panorama_collection | mixed panoramas | yes | panorama_generation_train_or_eval | FID/IS/CLIPScore/LPIPS/projection_specific_metrics | paper-dependent panorama collection; verify construction and split | needs_official_url | needs_manual_check |
| Pano3D | Pano3D | paired_panorama_depth_or_3d | panorama/depth scenes | yes | needs_manual_check | FID/CLIPScore/LPIPS/depth_or_geometry_metrics_protocol_dependent | paper-specific 3D/panorama training source; verify exact construction | needs_official_url | needs_manual_check |
| Poly Haven | Polyhaven | hdr_environment_maps | HDR environment maps | no_or_needs_check | lighting_or_hdr_environment_source | FID/LPIPS/visual_quality_protocol_dependent | HDRI/environment map source; verify exact subset and license | needs_official_url | needs_manual_check |
| RealEstate10K | RealEstate10K | real_video_view_synthesis | indoor/outdoor real estate videos | no_or_needs_check | needs_manual_check | needs_manual_check | video/view-synthesis dataset; in this pipeline often demo or qualitative evidence | needs_official_url | needs_manual_check |
| SUN360 | SUN360 | real_panorama_image | indoor/outdoor panoramas | yes | panorama_generation_train_or_eval | FID/IS/CLIPScore/LPIPS/projection_specific_metrics | classic 360 panorama dataset; often used for panorama generation evaluation | needs_official_url | needs_manual_check |
| ScanNet | ScanNet | real_3d_scan_reference_or_eval | indoor scenes | no_or_needs_check | rendered_panorama_or_view_synthesis_source | FID/CLIPScore/LPIPS/depth_or_geometry_metrics_protocol_dependent | RGB-D indoor scans; verify whether ScanNet/ScanNet++ is reference, demo, or evaluation | needs_official_url | needs_manual_check |
| Structured3D | Structure3D;Structured3D | synthetic_indoor_panorama | indoor scenes | yes | panorama_generation_train_or_eval | FID/IS/CLIPScore/LPIPS/projection_specific_metrics | synthetic structured indoor scenes with panoramic renderings | needs_official_url | needs_manual_check |
| iHDRI | iHDRI | hdr_environment_maps | HDR environment maps | no_or_needs_check | lighting_or_hdr_environment_source | FID/LPIPS/visual_quality_protocol_dependent | HDRI/environment map source; verify exact subset and license | needs_official_url | needs_manual_check |
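
As an illustration of how the aliases column might be used to map a raw mention onto its canonical dataset name, here is a minimal Python sketch. The helper names (`build_alias_index`, `canonicalize`, `is_unresolved`) are hypothetical, only three dictionary entries are copied in, and the case-insensitive matching is an assumption rather than documented pipeline behavior.

```python
# A few entries copied from the dictionary: canonical name -> aliases
# (the aliases cell is split on ';').
DATASET_ALIASES = {
    "LAVAL Indoor": ["LAVAL", "LAVAL Indoor", "Laval Indoor"],
    "Structured3D": ["Structure3D", "Structured3D"],
    "Poly Haven": ["Polyhaven"],
}

# Placeholder values the dictionary uses for fields that are not filled in yet.
PLACEHOLDERS = frozenset({"needs_official_url", "needs_manual_check"})


def build_alias_index(aliases):
    """Map every case-folded alias (and the canonical name itself) to the canonical name."""
    index = {}
    for canonical, names in aliases.items():
        for name in [canonical, *names]:
            index[name.casefold()] = canonical
    return index


ALIAS_INDEX = build_alias_index(DATASET_ALIASES)


def canonicalize(mention):
    """Return the canonical dataset name for a mention, or None if it is unknown."""
    return ALIAS_INDEX.get(mention.strip().casefold())


def is_unresolved(value):
    """True when a dictionary field still holds a placeholder value."""
    return value in PLACEHOLDERS
```

With these entries, `canonicalize("Structure3D")` returns `"Structured3D"`, and an unknown mention returns `None` so it can be queued for review instead of being silently guessed.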

Paper × Dataset × Role V2

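V2 allows the same paper and dataset to be split across multiple role rows. Only rows with sota_eligible=yes and a role of test_eval, ood_eval, or benchmark can enter the ranking candidates in Section 6.5.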
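
The eligibility rule can be expressed as a small filter. The sketch below assumes each table row has been loaded into a dict keyed by the column names `role` and `sota_eligible`; the function and constant names are hypothetical.

```python
# Roles that may be ranked, as stated in the rule above.
RANKABLE_ROLES = frozenset({"test_eval", "ood_eval", "benchmark"})


def is_ranking_candidate(row):
    """True only if the row is marked sota_eligible=yes AND has a rankable role."""
    return row.get("sota_eligible") == "yes" and row.get("role") in RANKABLE_ROLES


def ranking_candidates(rows):
    """Keep only the rows that may enter the Section 6.5 ranking candidates."""
    return [row for row in rows if is_ranking_candidate(row)]
```

Both conditions are required: a train row is never ranked, and a test_eval row with sota_eligible=no (for example, one blocked by missing_metric_table_link) is excluded as well.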

paper_id paper dataset role usage_phase polarity split sample_count table_id metric_variant comparable_group_id metric_linked sota_eligible sota_block_reason confidence
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation Matterport3D reference_only reference_or_related_work bibliographic_or_related 10,800; 2,295 no no role=reference_only medium
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation Matterport3D ood_eval ood_evaluation affirmed_or_ambiguous zero_shot no no missing_metric_table_link medium
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation Matterport3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation LAION reference_only reference_or_related_work bibliographic_or_related train no no role=reference_only medium
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation LAION pretrain_source foundation_model_pretrain affirmed_or_ambiguous train no no role=pretrain_source medium
a1a69b14748af5b3 Taming Stable Diffusion for Text to 360° Panorama Image Generation LAION caption_source caption_or_text_source affirmed_or_ambiguous train no no role=caption_source low
98bafd1887bf33aa Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models SUN360 train main_model_train affirmed_or_ambiguous train no no role=train medium
98bafd1887bf33aa Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models SUN360 train main_model_train affirmed_or_ambiguous no no role=train medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images SUN360 test_eval evaluation affirmed_or_ambiguous val no no missing_metric_table_link medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images SUN360 train main_model_train affirmed_or_ambiguous train;val 500 panoramas no no role=train medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images SUN360 train main_model_train affirmed_or_ambiguous train;val;test 500 panoramas no no role=train medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images SUN360 test_eval evaluation affirmed_or_ambiguous train;val;test 500 panoramas no no missing_metric_table_link medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAVAL Indoor train main_model_train affirmed_or_ambiguous train;val;test 500 panoramas no no role=train medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAVAL Indoor test_eval evaluation affirmed_or_ambiguous train;val;test 500 panoramas no no missing_metric_table_link medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAVAL Indoor reference_only reference_or_related_work bibliographic_or_related train;val;test 500 panoramas no no role=reference_only medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAVAL Indoor test_eval evaluation negated train;val;test 500 panoramas; 289 images no no missing_metric_table_link medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAVAL Indoor reference_only reference_or_related_work bibliographic_or_related val no no role=reference_only medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images LAION reference_only reference_or_related_work bibliographic_or_related train no no role=reference_only medium
7f4864044e04e5ec 360-Degree Panorama Generation from Few Unregistered NFoV Images ImageNet reference_only reference_or_related_work bibliographic_or_related train no no role=reference_only medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 train main_model_train affirmed_or_ambiguous train;val;zero_shot 3K no no role=train medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 ood_eval ood_evaluation affirmed_or_ambiguous train;val;zero_shot 3K table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|SUN360|CLIP-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Distort-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Inception Score|cropped_region|SUN360 yes yes eligible_eval_with_metric medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 pretrain_source foundation_model_pretrain affirmed_or_ambiguous train;val;zero_shot 3K no no role=pretrain_source medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 test_eval evaluation affirmed_or_ambiguous val table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|SUN360|CLIP-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Distort-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Inception Score|cropped_region|SUN360 yes yes eligible_eval_with_metric medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 pretrain_source foundation_model_pretrain affirmed_or_ambiguous val no no role=pretrain_source medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 train main_model_train affirmed_or_ambiguous train;test no no role=train medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 test_eval evaluation affirmed_or_ambiguous train;test table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|SUN360|CLIP-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Distort-FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|FID|cropped_region|SUN360;image_or_nfov_to_panorama|SUN360|Inception Score|cropped_region|SUN360 yes yes eligible_eval_with_metric medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 caption_source caption_or_text_source affirmed_or_ambiguous train;test no no role=caption_source low
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right SUN360 pretrain_source foundation_model_pretrain affirmed_or_ambiguous train;test no no role=pretrain_source medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor train main_model_train affirmed_or_ambiguous train;val;zero_shot 3K no no role=train medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor ood_eval ood_evaluation affirmed_or_ambiguous train;val;zero_shot 3K table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Distort-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Inception Score|cropped_region|Laval_Indoor yes yes eligible_eval_with_metric medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor pretrain_source foundation_model_pretrain affirmed_or_ambiguous train;val;zero_shot 3K no no role=pretrain_source medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor train main_model_train affirmed_or_ambiguous train;val no no role=train medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor test_eval evaluation affirmed_or_ambiguous train;val table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Distort-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Inception Score|cropped_region|Laval_Indoor yes yes eligible_eval_with_metric medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor pretrain_source foundation_model_pretrain affirmed_or_ambiguous train;val no no role=pretrain_source medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor reference_only reference_or_related_work bibliographic_or_related train;val;test no no role=reference_only medium
79ab21dbadb25cf0 Panorama Generation From NFoV Image Done Right LAVAL Indoor test_eval evaluation affirmed_or_ambiguous val;test table_2 CLIP-FID;Distort-FID;FID;Inception Score image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Distort-FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|cropped_region|Laval_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|Inception Score|cropped_region|Laval_Indoor yes yes eligible_eval_with_metric medium
ba47839b8289ce8e PanoDiffusion: 360-degree Panorama Outpainting via Diffusion Structured3D test_eval evaluation affirmed_or_ambiguous val table_2 FID;sFID panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Camera_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Layout_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|NFoV_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Random_Box_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Camera_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Layout_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|NFoV_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Random_Box_Mask yes yes eligible_eval_with_metric medium
ba47839b8289ce8e PanoDiffusion: 360-degree Panorama Outpainting via Diffusion Structured3D demo_input demo_or_qualitative affirmed_or_ambiguous val no no role=demo_input low
ba47839b8289ce8e PanoDiffusion: 360-degree Panorama Outpainting via Diffusion Structured3D train main_model_train affirmed_or_ambiguous train;val no no role=train medium
ba47839b8289ce8e PanoDiffusion: 360-degree Panorama Outpainting via Diffusion Structured3D train main_model_train affirmed_or_ambiguous train no no role=train medium
ba47839b8289ce8e PanoDiffusion: 360-degree Panorama Outpainting via Diffusion Structured3D test_eval evaluation affirmed_or_ambiguous train table_2 FID;sFID panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Camera_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Layout_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|NFoV_Mask;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Random_Box_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Camera_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Layout_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|NFoV_Mask;panorama_outpainting|Structured3D|sFID|erp_or_full_panorama|Random_Box_Mask yes yes eligible_eval_with_metric medium
9753801176957726353 What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? Matterport3D train main_model_train affirmed_or_ambiguous train 10,800 no no role=train medium
9753801176957726353 What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? Matterport3D reference_only reference_or_related_work bibliographic_or_related 10,800 no no role=reference_only medium
9753801176957726353 What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? Matterport3D mentioned_only mentioned_only affirmed_or_ambiguous no no role=mentioned_only low
9753801176957726353 What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? LAION pretrain_source foundation_model_pretrain affirmed_or_ambiguous train no no role=pretrain_source medium
9753801176957726353 What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? LAION caption_source caption_or_text_source affirmed_or_ambiguous train no no role=caption_source low
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion Matterport3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion Matterport3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion Structured3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion Structured3D mentioned_only mentioned_only affirmed_or_ambiguous no no role=mentioned_only low
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion HM3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
1507614108164de8 DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion HM3D derived_rendered_dataset derived_render_source affirmed_or_ambiguous no no role=derived_rendered_dataset medium
9033796522063612996 Spherical manifold guided diffusion model for panoramic image generation Matterport3D train main_model_train affirmed_or_ambiguous val 10,912 no no role=train medium
9033796522063612996 Spherical manifold guided diffusion model for panoramic image generation Matterport3D test_eval evaluation affirmed_or_ambiguous val 10,912 no no missing_metric_table_link medium
9033796522063612996 Spherical manifold guided diffusion model for panoramic image generation Matterport3D reference_only reference_or_related_work bibliographic_or_related zero_shot no no role=reference_only medium
9033796522063612996 Spherical manifold guided diffusion model for panoramic image generation Structured3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
9033796522063612996 Spherical manifold guided diffusion model for panoramic image generation COCO reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D test_eval evaluation affirmed_or_ambiguous val table_1;table_2;table_3;table_5;table_6 FID;FID_hori panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|scope_unknown;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D train main_model_train affirmed_or_ambiguous train;val 820 images; 0912 images no no role=train medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D test_eval evaluation affirmed_or_ambiguous train;val 820 images; 0912 images table_1;table_2;table_3;table_5;table_6 FID;FID_hori panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|scope_unknown;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D train main_model_train affirmed_or_ambiguous train no no role=train medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D test_eval evaluation affirmed_or_ambiguous train table_1;table_2;table_3;table_5;table_6 FID;FID_hori panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID_hori|erp_or_full_panorama|scope_unknown;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|Matterport3D;panorama_outpainting|Matterport3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Matterport3D caption_source caption_or_text_source affirmed_or_ambiguous train no no role=caption_source low
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D test_eval evaluation affirmed_or_ambiguous val 820 images table_1 FID;FID_hori panorama_outpainting|Structured3D|FID_hori|erp_or_full_panorama|Structured3D;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Structured3D yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D train main_model_train affirmed_or_ambiguous train;val 820 images; 0912 images; 21,133; 19,019 no no role=train medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D test_eval evaluation affirmed_or_ambiguous train;val 820 images; 0912 images; 21,133; 19,019 table_1 FID;FID_hori panorama_outpainting|Structured3D|FID_hori|erp_or_full_panorama|Structured3D;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Structured3D yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D train main_model_train affirmed_or_ambiguous train no no role=train medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D test_eval evaluation affirmed_or_ambiguous train table_1 FID;FID_hori panorama_outpainting|Structured3D|FID_hori|erp_or_full_panorama|Structured3D;panorama_outpainting|Structured3D|FID|erp_or_full_panorama|Structured3D yes yes eligible_eval_with_metric medium
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting Structured3D caption_source caption_or_text_source affirmed_or_ambiguous train no no role=caption_source low
13521719276910748592 Spherical-nested diffusion model for panoramic image outpainting COCO reference_only reference_or_related_work affirmed_or_ambiguous no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Structured3D reference_only reference_or_related_work bibliographic_or_related train 48000 panoramas no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Structured3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Structured3D train main_model_train affirmed_or_ambiguous train 700 panoramas; 20,000; 40,000 no no role=train medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Structured3D demo_input demo_or_qualitative affirmed_or_ambiguous train 700 panoramas; 20,000; 40,000 no no role=demo_input low
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 test_eval evaluation affirmed_or_ambiguous val;test 1000 panoramas table_1;table_2 CLIP-FID;CLIPScore;FAED;FID;KID image_or_nfov_to_panorama|SUN360|CLIP-FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|CLIP-FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|CLIPScore|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FAED|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|KID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|KID|erp_or_full_panorama|SUN360 yes yes eligible_eval_with_metric medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 caption_source caption_or_text_source affirmed_or_ambiguous val;test 1000 panoramas no no role=caption_source low
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 train main_model_train affirmed_or_ambiguous train;val 1000 panoramas no no role=train medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 test_eval evaluation affirmed_or_ambiguous train;val 1000 panoramas table_1;table_2 CLIP-FID;CLIPScore;FAED;FID;KID image_or_nfov_to_panorama|SUN360|CLIP-FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|CLIP-FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|CLIPScore|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FAED|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|KID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|KID|erp_or_full_panorama|SUN360 yes yes eligible_eval_with_metric medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 train main_model_train affirmed_or_ambiguous train;val 1000 panoramas no no role=train medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation SUN360 test_eval evaluation affirmed_or_ambiguous train;val 1000 panoramas table_1;table_2 CLIP-FID;CLIPScore;FAED;FID;KID image_or_nfov_to_panorama|SUN360|CLIP-FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|CLIP-FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|CLIPScore|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FAED|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|FID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|FID|erp_or_full_panorama|SUN360;image_or_nfov_to_panorama|SUN360|KID|cubemap|SUN360;image_or_nfov_to_panorama|SUN360|KID|erp_or_full_panorama|SUN360 yes yes eligible_eval_with_metric medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation LAVAL Indoor test_eval evaluation affirmed_or_ambiguous val;test table_1;table_2 CLIP-FID;CLIPScore;FAED;FID;KID image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|CLIPScore|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FAED|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|KID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|KID|erp_or_full_panorama|LAVAL_Indoor yes yes eligible_eval_with_metric medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation LAVAL Indoor caption_source caption_or_text_source affirmed_or_ambiguous val;test no no role=caption_source low
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation LAVAL Indoor test_eval evaluation affirmed_or_ambiguous val;test 1000 panoramas table_1;table_2 CLIP-FID;CLIPScore;FAED;FID;KID image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|CLIP-FID|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|CLIPScore|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FAED|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|FID|erp_or_full_panorama|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|KID|cubemap|LAVAL_Indoor;image_or_nfov_to_panorama|LAVAL Indoor|KID|erp_or_full_panorama|LAVAL_Indoor yes yes eligible_eval_with_metric medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation LAVAL Indoor caption_source caption_or_text_source affirmed_or_ambiguous val;test 1000 panoramas no no role=caption_source low
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation LAVAL Indoor mentioned_only mentioned_only affirmed_or_ambiguous val no no role=mentioned_only low
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Pano360 reference_only reference_or_related_work bibliographic_or_related train 48000 panoramas no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Poly Haven reference_only reference_or_related_work bibliographic_or_related train 48000 panoramas no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Poly Haven reference_only reference_or_related_work bibliographic_or_related train no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Humus reference_only reference_or_related_work bibliographic_or_related train 48000 panoramas no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation Humus reference_only reference_or_related_work bibliographic_or_related train no no role=reference_only medium
983cfc7c8bda0d9e CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation ImageNet reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Matterport3D test_eval evaluation affirmed_or_ambiguous table_1;table_2;table_7;table_8 CLIPScore;DS;FAED;FID panorama_generation_general|Matterport3D|CLIPScore|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|DS|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|FAED|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Matterport3D demo_input demo_or_qualitative affirmed_or_ambiguous no no role=demo_input low
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Matterport3D reference_only reference_or_related_work bibliographic_or_related train;val no no role=reference_only medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Matterport3D test_eval evaluation affirmed_or_ambiguous table_1;table_2;table_7;table_8 CLIPScore;DS;FAED;FID panorama_generation_general|Matterport3D|CLIPScore|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|DS|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|FAED|erp_or_full_panorama|scope_unknown;panorama_generation_general|Matterport3D|FID|erp_or_full_panorama|scope_unknown yes no qualitative_or_demo_context medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Matterport3D demo_input demo_or_qualitative affirmed_or_ambiguous no no role=demo_input low
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D train main_model_train affirmed_or_ambiguous train;test 9000 images no no role=train medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D test_eval evaluation affirmed_or_ambiguous train;test 9000 images table_10;table_11 CLIPScore;DS;FID panorama_generation_general|Structured3D|CLIPScore|erp_or_full_panorama|scope_unknown;panorama_generation_general|Structured3D|DS|erp_or_full_panorama|scope_unknown;panorama_generation_general|Structured3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D train main_model_train affirmed_or_ambiguous train;zero_shot 100 images no no role=train medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D ood_eval ood_evaluation affirmed_or_ambiguous train;zero_shot 100 images table_10;table_11 CLIPScore;DS;FID panorama_generation_general|Structured3D|CLIPScore|erp_or_full_panorama|scope_unknown;panorama_generation_general|Structured3D|DS|erp_or_full_panorama|scope_unknown;panorama_generation_general|Structured3D|FID|erp_or_full_panorama|scope_unknown yes yes eligible_eval_with_metric medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling Structured3D caption_source caption_or_text_source affirmed_or_ambiguous train;zero_shot 100 images no no role=caption_source low
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 train main_model_train affirmed_or_ambiguous train;val no no role=train medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 test_eval evaluation affirmed_or_ambiguous train;val no no missing_metric_table_link medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 ood_eval ood_evaluation affirmed_or_ambiguous train;val no no missing_metric_table_link medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 reference_only reference_or_related_work bibliographic_or_related train;val no no role=reference_only medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 train main_model_train affirmed_or_ambiguous train no no role=train medium
95bf3d1227a9198f Conditional Panoramic Image Generation via Masked Autoregressive Modeling SUN360 ood_eval ood_evaluation affirmed_or_ambiguous train no no missing_metric_table_link medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Matterport3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Matterport3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Matterport3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Structured3D reference_only reference_or_related_work bibliographic_or_related no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion Structured3D mentioned_only mentioned_only affirmed_or_ambiguous no no role=mentioned_only low
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion HM3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion HM3D reference_only reference_or_related_work bibliographic_or_related 1000 scenes no no role=reference_only medium
29833d4576d49165 DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion HM3D derived_rendered_dataset derived_render_source affirmed_or_ambiguous no no role=derived_rendered_dataset medium
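
Several rows above repeat verbatim (same paper, dataset, role, split and size). A minimal sketch of how such repeats could be flagged before counting, assuming each row is reduced to a tuple; the field order `(paper_id, dataset, role, split, size)` is a hypothetical layout for illustration, not the extractor's actual schema:

```python
from collections import Counter

# Hypothetical row layout: (paper_id, dataset, role, split, size).
rows = [
    ("983cfc7c8bda0d9e", "Poly Haven", "reference_only", "train", "48000 panoramas"),
    ("983cfc7c8bda0d9e", "Poly Haven", "reference_only", "train", "48000 panoramas"),
    ("983cfc7c8bda0d9e", "Poly Haven", "reference_only", "train", ""),
    ("29833d4576d49165", "HM3D", "derived_rendered_dataset", "", ""),
]


def split_duplicates(records):
    """Return (unique records in first-seen order, {record: number of extra copies})."""
    counts = Counter(records)
    unique = list(dict.fromkeys(records))  # dict preserves insertion order
    repeats = {rec: n - 1 for rec, n in counts.items() if n > 1}
    return unique, repeats


unique_rows, repeated = split_duplicates(rows)
```

Whether a repeated row is one mention extracted twice or two separate mentions still has to be decided by reading the paper; the sketch only surfaces the candidates.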

Role statistics

role rows
reference_only 83
test_eval 54
train 49
pretrain_source 15
caption_source 13
mentioned_only 11
demo_input 10
ood_eval 9
derived_rendered_dataset 2
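
The role table is a plain row count per role. A minimal sketch of that tally, assuming the role labels have already been pulled out of the row table; the sample list holds the roles of the 17 rows recorded for paper 95bf3d1227a9198f, and `tally_roles` is a hypothetical helper name:

```python
from collections import Counter


def tally_roles(roles):
    """Count rows per role, sorted by descending count, then by role name."""
    counts = Counter(roles)
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Roles of the rows listed for paper 95bf3d1227a9198f, in document order.
sample_roles = [
    "test_eval", "demo_input", "reference_only", "test_eval", "demo_input",
    "reference_only", "train", "test_eval", "train", "ood_eval",
    "caption_source", "train", "test_eval", "ood_eval", "reference_only",
    "train", "ood_eval",
]

role_table = tally_roles(sample_roles)
```

Because the count is per row, verbatim duplicate rows inflate it; the figures should be read as mention counts, not as the number of papers using a dataset in that role.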

Dataset summary

dataset mention_count role_candidates aliases representative_papers description_status
COCO 4 reference_only COCO 360dvd: Controllable panorama video generation with 360-degree video diffusion model; SphereDrag: Spherical Geometry-Aware Panoramic Image Editing; Spherical manifold guided diffusion model for panoramic image generation; Spherical-nested diffusion model for panoramic image outpainting evidence extracted from paper md; roles are heuristic QA and need close-reading review
Gibson 1 reference_only Gibson Top2Pano: Learning to Generate Indoor Panoramas from Top-Down View evidence extracted from paper md; roles are heuristic QA and need close-reading review
HDRI-Skies 1 reference_only HDRI-Skies DreamCube: 3D Panorama Generation via Multi-plane Synchronization evidence extracted from paper md; roles are heuristic QA and need close-reading review
HM3D 2 reference_only HM3D DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion; DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion evidence extracted from paper md; roles are heuristic QA and need close-reading review
Humus 3 reference_only Humus 360Anything: Geometry-Free Lifting of Images and Videos to 360°; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; DreamCube: 3D Panorama Generation via Multi-plane Synchronization evidence extracted from paper md; roles are heuristic QA and need close-reading review
ImageNet 2 reference_only ImageNet 360-Degree Panorama Generation from Few Unregistered NFoV Images; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation evidence extracted from paper md; roles are heuristic QA and need close-reading review
LAION 4 pretrain_source; reference_only LAION 360-Degree Panorama Generation from Few Unregistered NFoV Images; Taming Stable Diffusion for Text to 360° Panorama Image Generation; Twindiffusion: Enhancing coherence and efficiency in panoramic image generation with diffusion models; What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? evidence extracted from paper md; roles are heuristic QA and need close-reading review
LAVAL Indoor 12 pretrain_source; reference_only; test_eval LAVAL; LAVAL Indoor; Laval Indoor 360-Degree Panorama Generation from Few Unregistered NFoV Images; 360Anything: Geometry-Free Lifting of Images and Videos to 360°; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; Panorama Generation From NFoV Image Done Right evidence extracted from paper md; roles are heuristic QA and need close-reading review
Matterport3D 15 mentioned_only; pretrain_source; reference_only; test_eval Matterport3D CamFreeDiff: camera-free image to panorama generation with diffusion model; Conditional Panoramic Image Generation via Masked Autoregressive Modeling; DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion; DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion; JoPano: Unified Panorama Generation via Joint Modeling evidence extracted from paper md; roles are heuristic QA and need close-reading review
Pano360 3 reference_only Pano360 360Anything: Geometry-Free Lifting of Images and Videos to 360°; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; DreamCube: 3D Panorama Generation via Multi-plane Synchronization evidence extracted from paper md; roles are heuristic QA and need close-reading review
Pano3D 1 reference_only Pano3D Omni2: Unifying Omnidirectional Image Generation and Editing in an Omni Model evidence extracted from paper md; roles are heuristic QA and need close-reading review
Poly Haven 4 reference_only Polyhaven 360Anything: Geometry-Free Lifting of Images and Videos to 360°; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; DreamCube: 3D Panorama Generation via Multi-plane Synchronization; TanDiT: Tangent-Plane Diffusion Transformer for High-Quality 360° Panorama Generation evidence extracted from paper md; roles are heuristic QA and need close-reading review
RealEstate10K 1 test_eval RealEstate10K 360Anything: Geometry-Free Lifting of Images and Videos to 360° evidence extracted from paper md; roles are heuristic QA and need close-reading review
SUN360 9 pretrain_source; reference_only; test_eval SUN360 360-Degree Panorama Generation from Few Unregistered NFoV Images; 360Anything: Geometry-Free Lifting of Images and Videos to 360°; Conditional Panoramic Image Generation via Masked Autoregressive Modeling; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models evidence extracted from paper md; roles are heuristic QA and need close-reading review
ScanNet 2 test_eval ScanNet 360Anything: Geometry-Free Lifting of Images and Videos to 360°; MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion evidence extracted from paper md; roles are heuristic QA and need close-reading review
Structured3D 15 mentioned_only; pretrain_source; reference_only; test_eval Structure3D; Structured3D 360Anything: Geometry-Free Lifting of Images and Videos to 360°; CamFreeDiff: camera-free image to panorama generation with diffusion model; Conditional Panoramic Image Generation via Masked Autoregressive Modeling; CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation; DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion evidence extracted from paper md; roles are heuristic QA and need close-reading review
iHDRI 1 reference_only iHDRI DreamCube: 3D Panorama Generation via Multi-plane Synchronization evidence extracted from paper md; roles are heuristic QA and need close-reading review
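
The summary merges spelling variants (for example Polyhaven, Structure3D, Laval Indoor) under one canonical dataset name before counting. A minimal sketch of that aggregation, assuming mentions arrive as `(raw_name, role)` pairs; the alias map copies only the variants listed in the aliases column, and `canonical` / `summarize` are hypothetical helper names:

```python
# Alias variants taken from the aliases column; keys are raw spellings.
ALIASES = {
    "Polyhaven": "Poly Haven",
    "Structure3D": "Structured3D",
    "LAVAL": "LAVAL Indoor",
    "Laval Indoor": "LAVAL Indoor",
}


def canonical(name):
    """Map a raw dataset spelling to its canonical name (identity if unknown)."""
    name = name.strip()
    return ALIASES.get(name, name)


def summarize(mentions):
    """Aggregate (raw_name, role) pairs into {canonical: (mention_count, role_candidates)}."""
    summary = {}
    for raw, role in mentions:
        entry = summary.setdefault(canonical(raw), {"count": 0, "roles": set()})
        entry["count"] += 1
        entry["roles"].add(role)
    return {
        name: (e["count"], "; ".join(sorted(e["roles"])))
        for name, e in summary.items()
    }


example = summarize([
    ("Polyhaven", "reference_only"),
    ("Poly Haven", "reference_only"),
    ("Structure3D", "train"),
    ("Structured3D", "test_eval"),
])
```

Joining the sorted role set with "; " mirrors the role_candidates column; the roles themselves remain heuristic and still need manual review.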

Dataset introductions and evidence

COCO

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • Spherical manifold guided diffusion model for panoramic image generation [reference_only/medium]: ncoders and large language models. In International conference on machine learning, pages 19730–19742. PMLR, 2023. 5 [26] Chieh Hubert Lin, Chia-Che Chang, Yu-Sheng Chen, Da-Cheng Juan, Wei Wei, and Hwann-Tzong Chen. Coco-gan: Generation by parts via conditional coordinating. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4512–4521, 2019. 5 [27] I Loshchilov. Decoupled weight dec...
  • Spherical-nested diffusion model for panoramic image outpainting [reference_only/high]: ., Wei, Y., and Zhao, Y. Cylin-painting: Seamless 360° panoramic image outpainting and beyond. IEEE Transactions on Image Processing, 33:382–394, 2024. Lin, C. H., Chang, C., Chen, Y., Juan, D., Wei, W., and Chen, H. COCO-GAN: generation by parts via conditional coordinating. In IEEE International Conference on Computer Vision (ICCV), 2019. Loshchilov, I. and Hutter, F. Decoupled weight decay regularization. In In...
  • SphereDrag: Spherical Geometry-Aware Panoramic Image Editing [reference_only/high]: n the contemporary panoramic image generation field, research methodologies are generally categorized into two main paradigms: GAN-based models and diffusion-based models. In the domain of GAN-based [6, 11] approaches, COCO-GAN [15] introduces a coordinate-conditional framework that generates images in a divided manner, using spatial coordinates as a guiding signal for the generator to progressively synthesize loc... || , Qi, Z., Wang, G., Shan, Y., Li, X.: Sgat4pass: spherical geometryaware transformer for panoramic semantic segmentation. Proc. of IJCAI (2023) 15. Lin, C.H., Chang, C.C., Chen, Y.S., Juan, D.C., Wei, W., Chen, H.T.: Coco-gan: Generation by parts via conditional coordinating. In: Proc. CVPR. pp. 4512–4521 (2019) 16. Liu, H., Xu, C., Yang, Y., Zeng, L., He, S.: Drag your noise: Interactive point-based editing via d...
  • 360dvd: Controllable panorama video generation with 360-degree video diffusion model [reference_only/medium]: age understanding and generation. In International Conference on Machine Learning, pages 12888– 12900. PMLR, 2022. 4 [23] Chieh Hubert Lin, Chia-Che Chang, Yu-Sheng Chen, Da-Cheng Juan, Wei Wei, and Hwann-Tzong Chen. Coco-gan: Generation by parts via conditional coordinating. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4512–4521, 2019. 3 [24] Chieh Hubert Lin, Hsin-Ying Lee, Y...

Gibson

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • Top2Pano: Learning to Generate Indoor Panoramas from Top-Down View [reference_only/medium]: >ScenesFloorsPanoramasScenesFloorsPanoramasMatterport3D61127617714291405Gibson152203537939761672 Table 1. The numbers of scenes, floors, and panorama images in the training and testing sets of the... || 0.6163PanFusion[41]11.450.437285.740.6153Top2Pano (Ours)11.720.440930.840.6029GibsonSat2Density[28]+LDM[29]10.540.448084.330.6462Sat2Density[28]+ControlNet[42]10.970.458285.210... || d>79.530.6634Top2Pano (Ours)11.580.485128.680.6282 Table 2. Quantitative comparison with existing methods on the Matterport3D [3] and Gibson [36] datasets. # 4. Experiments # 4.1. Data Preparation For evaluation, we use the Matterport3D [3] and Gibson [36] datasets. Since no existing dataset provides both top-down views and high-q...

HDRI-Skies

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • DreamCube: 3D Panorama Generation via Multi-plane Synchronization [reference_only/medium]: generalization capabilities across diverse environments, we construct a more comprehensive dataset by combining multiple publicly available sources, including Structured3D [64], Pano360 [25], Polyhaven [40], Humus [38], HDRI-Skies [13] and iHDRI [14]. This combined dataset encompasses a broad spectrum of both indoor and outdoor environments, resulting in more than 30,000 panoramic instances. This general setting a... || ang Guo, Sparsh Garg, S Mahdi H Miangoleh, Xinyu Huang, and Liu Ren. Depth any camera: Zero-shot metric depth estimation from any camera. arXiv preprint arXiv:2501.02464, 2025. 3, 7, 8 [13] hdri skies. HDRIs. https://hdri-skies.com/, accessed 02/2025. 6 [14] hdri skies. HDRIs. https://www.ihdri.com/hdri-skiesoutdoor/, accessed 02/2025. 6 [15] Jing He, Haodong Li, Wei Yin, Yixun Liang, Leheng Li, Kaiqiang Zhou, Hon...

HM3D

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion [reference_only/medium]: y small in scale and only have single-view panoramas, which cannot support multi-view panorama generation, except Matterport3D [4]. In addition, the sky box images in Matterport3D [4] contain only sparse views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text description for each view. To generate multi-view panoramas, we render cube maps at each viewpoint in the 3D mes... || views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text description for each view. To generate multi-view panoramas, we render cube maps at each viewpoint in the 3D meshes of HM3D, using the Habitat Simulator [41], and stitch them into panoramas. We generate complete text descriptions corresponding to the panoramas by using Blip2 [22] to create text descriptions for eac... || noramic images, and the corresponding text descriptions are not precise enough. To address these issues, we utilize the Habitat Simulator [41] to randomly select positions within the scenes of the Habitat Matterport 3D (HM3D) [37] dataset and render the six-face cube maps. These cube maps are then interpolated and stitched together to form panoramas so we can obtain panoramas with clear tops and bottoms. To genera...
  • DiffPano: Scalable and Consistent Text to Panorama Generation with Spherical Epipolar-Aware Diffusion [reference_only/medium]: y small in scale and only have single-view panoramas, which cannot support multi-view panorama generation, except Matterport3D [4]. In addition, the sky box images in Matterport3D [4] contain only sparse views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text description for each view. To generate multi-view panoramas, we render cube maps at each viewpoint in the 3D mes... || views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text description for each view. To generate multi-view panoramas, we render cube maps at each viewpoint in the 3D meshes of HM3D, using the Habitat Simulator [41], and stitch them into panoramas. We generate complete text descriptions corresponding to the panoramas by using Blip2 [22] to create text descriptions for eac... || noramic images, and the corresponding text descriptions are not precise enough. To address these issues, we utilize the Habitat Simulator [41] to randomly select positions within the scenes of the Habitat Matterport 3D (HM3D) [37] dataset and render the six-face cube maps. These cube maps are then interpolated and stitched together to form panoramas so we can obtain panoramas with clear tops and bottoms. To genera...

Humus

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [reference_only/medium]: h 50 steps during inference. # 5.1.2 DATASETS Training. We train on a mixture of indoor and outdoor environments by combining multiple publicly available sources, including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training. While Humus provides an explicit cubemap r... || , including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training. While Humus provides an explicit cubemap representations, all other datasets come with equirectangular panoramas. We thus first generate cubemaps from these panoramas using standard perspective projectio... || 2024. Nasir Mohammad Khalid, Tianhao Xie, Eugene Belilovsky, and Tiberiu Popa. CLIP-Mesh: Generating textured meshes from text using pretrained image-text models. In SIGGRAPH Asia, 2022. Emil Persson. Texture from Humus. https://www.humus.name/index.php?page=Textures, accessed 09/2024. polyhaven.com. HDRIs. https://polyhaven.com/hdris, accessed 09/2024. Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall....
  • DreamCube: 3D Panorama Generation via Multi-plane Synchronization [reference_only/medium]: our model’s generalization capabilities across diverse environments, we construct a more comprehensive dataset by combining multiple publicly available sources, including Structured3D [64], Pano360 [25], Polyhaven [40], Humus [38], HDRI-Skies [13] and iHDRI [14]. This combined dataset encompasses a broad spectrum of both indoor and outdoor environments, resulting in more than 30,000 panoramic instances. This gener... || [37] William Peebles and Saining Xie. Scalable Diffusion Models with Transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4195–4205, 2023. 4 [38] Emil Persson. Texture from Humus. https://www.humus.name/index.php?page=Textures, accessed 02/2025. 6 [39] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Muller, Joe Penna, and ¨ Robin Rombach. S... || es and Saining Xie. Scalable Diffusion Models with Transformers. In Proceedings of the IEEE/CVF international conference on computer vision, pages 4195–4205, 2023. 4 [38] Emil Persson. Texture from Humus. https://www.humus.name/index.php?page=Textures, accessed 02/2025. 6 [39] Dustin Podell, Zion English, Kyle Lacey, Andreas Blattmann, Tim Dockhorn, Jonas Muller, Joe Penna, and ¨ Robin Rombach. SDXL: Improving Lat...
  • 360Anything: Geometry-Free Lifting of Images and Videos to 360° [reference_only/medium]: Vanishing Point Estimation in Uncalibrated Images with Prior Gravity Direction. ICCV (2023) 12 54. Peebles, W., Xie, S.: Scalable diffusion models with transformers. ICCV (2023) 2, 4 55. Persson, E.: Texture from Humus. https://www.humus.name/index.php?page=Textures (accessed 09/2024) 21 56. Piccinelli, L., Yang, Y.H., Sakaridis, C., Segu, M., Li, S., Van Gool, L., Yu, F.: UniDepth: Universal monocular metric dept... || timation in Uncalibrated Images with Prior Gravity Direction. ICCV (2023) 12 54. Peebles, W., Xie, S.: Scalable diffusion models with transformers. ICCV (2023) 2, 4 55. Persson, E.: Texture from Humus. https://www.humus.name/index.php?page=Textures (accessed 09/2024) 21 56. Piccinelli, L., Yang, Y.H., Sakaridis, C., Segu, M., Li, S., Van Gool, L., Yu, F.: UniDepth: Universal monocular metric depth estimation. In:... || Training Data Image data. To facilitate a fair comparison with the previous state-of-the-art method, CubeDiff [29], we follow them to use the same panorama image datasets as training data. These include Polyhaven [57], Humus [55], Structured3D [103], and Pano360 [32]. For Structured3D, we use all three subsets, namely, empty, simple, and full. As a result, around 90% of data are synthetic rendering of indoor rooms...

ImageNet

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • 360-Degree Panorama Generation from Few Unregistered NFoV Images [reference_only/medium]: Jonathan Ho and Tim Salimans. 2022. Classifier-free diffusion guidance. arXiv preprint arXiv:2207.12598 (2022). [17] Tuomas Kynkäänniemi, Tero Karras, Miika Aittala, Timo Aila, and Jaakko Lehtinen. 2022. The Role of ImageNet Classes in Fréchet Inception Distance. arXiv preprint arXiv:2203.06026 (2022). [18] Junnan Li, Dongxu Li, Caiming Xiong, and Steven Hoi. 2022. BLIP: Bootstrapping Language-Image Pre-training...
  • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [reference_only/medium]: k. SPEC: Seeing people in the wild with an estimated camera. In International Conference on Computer Vision (ICCV), 2021. Tuomas Kynkäänniemi, Tero Karras, Miika Aittala, Timo Aila, and Jaakko Lehtinen. The role of ImageNet classes in Fréchet Inception Distance. arXiv preprint arXiv:2203.06026, 2022. Zhuqiang Lu, Kun Hu, Chaoyue Wang, Lei Bai, and Zhiyong Wang. Autoregressive omni-aware outpainting for open-vo...

LAION

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • Taming Stable Diffusion for Text to 360° Panorama Image Generation [reference_only/medium]: es for training gans. In NeurIPS, pages 2226–2234, 2016. 5 [38] Christoph Schuhmann, Richard Vencu, Romain Beaumont, Robert Kaczmarczyk, Clayton Mullis, Aarush Katta, Theo Coombes, Jenia Jitsev, and Aran Komatsuzaki. Laion-400m: Open dataset of clip-filtered 400 million image-text pairs. arXiv preprint arXiv:2111.02114, 2021. 1, 2 [39] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman... || Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, Patrick Schramowski, Srivatsa Kundurthy, Katherine Crowson, Ludwig Schmidt, Robert Kaczmarczyk, and Jenia Jitsev. LAION-5B: an open large-scale dataset for training next generation image-text models. In NeurIPS, pages 25278–25294, 2022. 1, 2 [40] Ka Chun Shum, Hong-Wing Pang, Binh-Son Hua, Duc Thanh Nguyen, and...
  • 360-Degree Panorama Generation from Few Unregistered NFoV Images [reference_only/medium]: sion and pattern recognition. 4938–4947. [33] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. 2022. Laion-5b: An open large-scale dataset for training next generation image-text models. arXiv preprint arXiv:2210.08402 (2022). [34] Jiaming Song, Chenlin Meng, and Stefano Ermon. 2020. Denoising diffus...
  • What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? [pretrain_source/medium]: subject-driven generation. In CVPR, 2023. 1, 3 [38] Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. Laion-5b: An open large-scale dataset for training next generation image-text models. In NeurIPS, 2022. 1 [39] Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc Le, Geoffrey Hinton,...
  • Twindiffusion: Enhancing coherence and efficiency in panoramic image generation with diffusion models [pretrain_source/medium]: niques for training gans. Advances in neural information processing systems, 29, 2016. [28] C. Schuhmann, R. Beaumont, R. Vencu, C. Gordon, R. Wightman, M. Cherti, T. Coombes, A. Katta, C. Mullis, M. Wortsman, et al. Laion-5b: An open large-scale dataset for training next generation image-text models. Advances in Neural Information Processing Systems, 35: 25278–25294, 2022. [29] J. Sohl-Dickstein, E. Weiss, N. Mahe...

LAVAL Indoor

Introduction: to be filled in next round from web search / official dataset materials.

Paper evidence:

  • 360-Degree Panorama Generation from Few Unregistered NFoV Images [test_eval/medium]: rama, the extra width is removed from the final image. # 5 EXPERIMENTS # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respecti... || # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respectively. For Laval Indoor, we followed the approach in [11] and chose 289... || [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respectively. For Laval Indoor, we followed the approach in [11] and chose 289 images for testing. Notably, we do not train our model using the Laval Indoor dataset as there are already indoor scenes within our selec...
  • 360-Degree Panorama Generation from Few Unregistered NFoV Images [reference_only/medium]: tent feature as ??′, ??′, respectively. Table 1: FID↓ results compared with other generation methods for quantitative evaluation.
    MethodsSUN360 [39]Laval [9]
    SinglePair (GT rots)Pair (Pred rots)SinglePair (GT rots)Pair (Pred rots)
    SIG-SS [11]13.0615.9... || rama, the extra width is removed from the final image. # 5 EXPERIMENTS # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respecti... || # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respectively. For Laval Indoor, we followed the approach in [11] and chose 289...
  • Panorama Generation From NFoV Image Done Right [reference_only/medium]: g the distortion prior in Distort-CLIP to further constrain the distortion. By introducing the decoupled pipeline, we achieve the best image quality and second-best distortion accuracy in SUN360 and SOTA performance in Laval Indoor in zeroshot manner. Notably, we only use 3K training data, which is 15 times less than the existing methods, but achieved surprising generalization ability, highlighting the importance... || and bottom region). The best, second-best results are in bold, underline.
    MethodYearTraining samplesSUN360Laval Indoor
    FID ↓CLIP-FID ↓Distort-FID ↓IS ↑FID ↓CLIP-FID ↓Distort-FID ↓IS ↑
    OmniDreamer2022<... || { r e c }$ is the same as Eq. 4 and λ is the coefficient which we set at 0.05 in the experiment. # 5. Experiments # 5.1. Experimental Setup Datasets. We follow PanoDiff [43] to conduct experiments in SUN360 [48] and Laval Indoor [11] datasets. SUN360 comprises both indoor and outdoor scenes while Laval Indoor only has indoor scene. We use 3000/500 SUN360 data for training/testing and 289 Laval Indoor data for zero...
  • Panorama Generation From NFoV Image Done Right [pretrain_source/medium]: g the distortion prior in Distort-CLIP to further constrain the distortion. By introducing the decoupled pipeline, we achieve the best image quality and second-best distortion accuracy in SUN360 and SOTA performance in Laval Indoor in zero-shot manner. Notably, we only use 3K training data, which is 15 times less than the existing methods, but achieved surprising generalization ability, highlighting the importance... || ntNet basically follows the architecture of previous mask-based outpainting method [43]. Table 2. Comparison with SOTA methods. † means re-implementing in our setting for fair comparison. Note that the bottom region of Laval is entirely black edges and we crop 20% of it when testing image quality and undo it when testing distortion as it requires full image. (·) means the crop setting of PanoDiff (crop 20% up and... || and bottom region). The best, second-best results are in bold, underline.
    Method | Year | Training samples | SUN360 | Laval Indoor
    FID ↓ | CLIP-FID ↓ | Distort-FID ↓ | IS ↑ | FID ↓ | CLIP-FID ↓ | Distort-FID ↓ | IS ↑
    OmniDreamer2022<...
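
Evidence bullets in this extract follow the shape `• <paper title> [<role>/<confidence>]: <snippet> || <snippet>`, and the same paper can appear more than once under one dataset. The sketch below shows one way to parse such bullets and drop exact repeats before manual review. The names `BULLET`, `parse_bullet`, and `dedupe` are hypothetical helpers, not part of any existing tool, and the sketch assumes each bullet sits on a single line (continuation lines such as table rows are ignored).

```python
import re

# Hypothetical pattern for one evidence bullet:
#   "• <title> [<role>/<confidence>]: <snippet> || <snippet> ..."
BULLET = re.compile(
    r"^\s*•\s*(?P<title>.+?)\s*\[(?P<role>\w+)/(?P<conf>\w+)\]:\s*(?P<body>.*)$"
)

def parse_bullet(line):
    """Return a dict for an evidence bullet, or None for any other line."""
    m = BULLET.match(line)
    if m is None:
        return None
    # Snippet windows are separated by "||" in this extract.
    snippets = tuple(s.strip() for s in m.group("body").split("||"))
    return {
        "title": m.group("title"),
        "role": m.group("role"),
        "conf": m.group("conf"),
        "snippets": snippets,
    }

def dedupe(entries):
    """Keep the first occurrence of each (title, role, snippets) triple."""
    seen = set()
    kept = []
    for entry in entries:
        key = (entry["title"], entry["role"], entry["snippets"])
        if key not in seen:
            seen.add(key)
            kept.append(entry)
    return kept
```

Entries that share a title but carry different role labels are kept on purpose, since a conflicting label is exactly what a reviewer needs to see.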

    Matterport3D

    Description: to be filled in next round from web search / official dataset documentation.

    Paper evidence:

    • Taming Stable Diffusion for Text to 360° Panorama Image Generation [reference_only/medium]: { N } \sum _ { i = 1 } ^ { N } \bar { \mathcal { L } ^ { i } } } \end{array}$ . Note that the SD UNet blocks remain frozen. # 4. Experiment # 4.1. Experimental Setup Dataset. We follow the MVDiffusion [47] to use the Matterport3D dataset [3], which has 10,800 panoramic images with 2,295 room layout annotations. We employ BLIP-2 [18] to generate a short description for each image. Implementation Details. For text-c... || hs for controlled image generation. In ICML, pages 1737–1752. PMLR, 2023. 2, 3 [3] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3d: Learning from rgb-d data in indoor environments. 3DV, 2017. 5, 1, 2 [4] Zhaoxi Chen, Guangcong Wang, and Ziwei Liu. Text2light: Zero-shot text-driven hdr panorama generation. ACM TOG, 41... || Thomas Wolf. Diffusers: State-of-the-art diffusion models. https://github.com/huggingface/ diffusers, 2022. 1 [50] Fu-En Wang, Yu-Hsuan Yeh, Min Sun, Wei-Chen Chiu, and Yi-Hsuan Tsai. Layoutmp3d: Layout annotation of matterport3d. arXiv preprint arXiv:2003.13516, 2020. 1 [51] Guangcong Wang, Yinuo Yang, Chen Change Loy, and Ziwei Liu. Stylelight: Hdr panorama generation for lighting estimation and editing. In ECCV...
    • What Makes for Text to 360-degree Panorama Generation with Stable Diffusion? [reference_only/medium]: spherical distortion, setting them apart from standard square perspective images. On top of this, due to the high cost of capturing panoramic images in practice, the panoramic datasets are often relatively scarce, e.g. Matterport3D [5] contains 10,800 panoramic images. The lack of data complicates the training of generative models, as conventional perspective diffusion models [35] generally require billions of tex... || pacity of Wo because of its superiority on FAED, while we highlight that such a choice may not be optimal and encourage future work to investigate further. # 4. Experiments # 4.1. Experimental Setup Dataset. Matterport3D dataset [5] is a scene understanding dataset with 10,800 panoramic images. We use the same captions as [56], which are generated by BLIP-2 [20] with a prompt of “a 360 - degree view of”.... || sion: Fusing diffusion paths for controlled image generation. In ICML, 2023. 3 [5] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3d: Learning from rgb-d data in indoor environments. In 3DV, 2017. 1, 6, 11 [6] Shoufa Chen, Peize Sun, Yibing Song, and Ping Luo. Diffusiondet: Diffusion model for object detection. In ICCV...
    • DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion [reference_only/medium]: Figure 2: Panoramic Video Construction and Caption Pipeline. Stanford 2D-3D-S [1], and Matterport3D dataset [4], etc. Most of these datasets are relatively small in scale and only have single-view panoramas, which cannot support multi-view panorama generation, except Matterport3D [4]. In addit... || Pipeline. Stanford 2D-3D-S [1], and Matterport3D dataset [4], etc. Most of these datasets are relatively small in scale and only have single-view panoramas, which cannot support multi-view panorama generation, except Matterport3D [4]. In addition, the sky box images in Matterport3D [4] contain only sparse views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text descript... || ataset [4], etc. Most of these datasets are relatively small in scale and only have single-view panoramas, which cannot support multi-view panorama generation, except Matterport3D [4]. In addition, the sky box images in Matterport3D [4] contain only sparse views. Although HM3D [37] provides the textured mesh of 1000 scenes, it lacks the corresponding text description for each view. To generate multi-view panoramas...
    • Spherical manifold guided diffusion model for panoramic image generation [reference_only/medium]: ordingly, with FIDequ and FIDpole calculated separately for each group. # 4. Experiment # 4.1. Experimental Settings Dataset. We employed the widely-used Matterport3D [4] dataset to evaluate the performance for text-conditioned panoramic image generation. More specifically, we obtained 10,912 panoramic images with a resolution of 1024 × 512 according t... || image generation. In International Conference on Machine Learning, 2023. 2, 6 [4] Angel Chang, Angela Dai, Thomas Funkhouser, Maciej Halber, Matthias Niessner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3d: Learning from rgb-d data in indoor environments. arXiv preprint arXiv:1709.06158, 2017. 5 [5] Zhaoxi Chen, Guangcong Wang, and Ziwei Liu. Text2light: Zero-shot text-driven hdr panorama ge...
    • Spherical-nested diffusion model for panoramic image outpainting [test_eval/medium]: image outpainting without introducing distortion to the provided regions. # 4. Experimental Results # 4.1. Experimental Settings Datasets. To evaluate the performance of our SpND model, we employed the widely applied Matterport3D (Chang et al., 2017) and Structured3D (Zheng et al., 2020) dataset for comparison. Similar to (Lin et al., 2019), we obtained 10912 panoramic images with size 1024 × 512 for the Matterport3D dataset. A total of 9,820 images were selected for the training, and all 10,912 images were used for evaluation to compute the sufficient statistics. For the Structured3D dataset, we... || employed the widely applied Matterport3D (Chang et al., 2017) and Structured3D (Zheng et al., 2020) dataset for comparison. Similar to (Lin et al., 2019), we obtained 10912 panoramic images with size 1024 × 512 for the Matterport3D dataset. A total of 9,820 images were selected for the training, and all 10,912 images were used for evaluation to compute the sufficient statistics. For the Structured3D dataset, we... || h application-specific prompts, we trained an additional model incorporating varying text prompts, denoted as SpNDprompt. Table 1. Quantitative comparisons with state-of-the-art methods on Matterport3D and Structured3D. The best results and the second-best results are highlighted in bold, underline.
      - | Matterport3D | Structured3D ...
    • Conditional Panoramic Image Generation via Masked Autoregressive Modeling [reference_only/medium]: ons holistically reconcile the unique demands of 360-degree imagery with the flexibility of MAR modeling. Experiments demonstrate that PAR outperforms specialist models in the text-to-panorama task, with 37.37 FID on the Matterport3D dataset. Moreover, on the outpainting task, it has better generation quality and avoids the problem of repeated structure generation. Ablation studies demonstrate the model’s scalabili... || G [23] coefficient as 5. In this paper, the masking sequence for training and the sampling sequence for inference are initialized with a uniform distribution unless otherwise stated. Datasets and Metrics. We mainly use Matterport3D [6] for comparisons. The split of the training and validation set follows PanFusion [74]. We use Janus-Pro-7B [9] to generate the captions. Table 1: Performance on text-to-panorama task... || (e) PAR w/ prompt Figure 4: Qualitative comparisons of panorama outpainting on the Matterport3D dataset. PAR-1.4B is used for this task, where PAR w/o prompt means the textual prompt is set as empty. Several metrics are used in this paper. Fréchet Inception Distance (FID) [21] measures...

      Pano360

      Description: to be filled in next round from web search / official dataset documentation.

      Paper evidence:

      • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [reference_only/medium]: ixture of indoor and outdoor environments by combining multiple publicly available sources, including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training. While Humus provides an explicit cubemap representations, all other datasets come with equirectangular panoramas....
      • DreamCube: 3D Panorama Generation via Multi-plane Synchronization [reference_only/medium]: setting. To further evaluate our model’s generalization capabilities across diverse environments, we construct a more comprehensive dataset by combining multiple publicly available sources, including Structured3D [64], Pano360 [25], Polyhaven [40], Humus [38], HDRI-Skies [13] and iHDRI [14]. This combined dataset encompasses a broad spectrum of both indoor and outdoor environments, resulting in more than 30,000 pa...
      • 360Anything: Geometry-Free Lifting of Images and Videos to 360° [reference_only/medium]: tate a fair comparison with the previous state-of-the-art method, CubeDiff [29], we follow them to use the same panorama image datasets as training data. These include Polyhaven [57], Humus [55], Structured3D [103], and Pano360 [32]. For Structured3D, we use all three subsets, namely, empty, simple, and full. As a result, around 90% of data are synthetic rendering of indoor rooms from Structured3D. We then use Gem...

      Pano3D

      Description: to be filled in next round from web search / official dataset documentation.

      Paper evidence:

      • Omni2: Unifying Omnidirectional Image Generation and Editing in an Omni Model [reference_only/medium]: ng InternVL2-5 [9] to generate text condition. For inpainting and outpainting tasks, images are randomly masked as input image conditions. For semantic2ODI and depth2ODI tasks, we use paired images from Structured3D and Pano3D [3], respectively, and generate captions to provide text conditions for these tasks. The detailed construction process is provided in the supplementary materials. # 3.2 Editing Subset While... || Vision and Pattern Recognition (CVPR). 11441–11450. [3] Georgios Albanis, Nikolaos Zioulis, Petros Drakoulis, Vasileios Gkitsas, Vladimiros Sterzentsenko, Federico Alvarez, Dimitrios Zarpalas, and Petros Daras. 2021. Pano3d: A holistic benchmark and a solid baseline for 360deg depth estimation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). 3727–3737. [4] Omer Bar-Tal,... || tion. To bridge this gap, we introduce a Depth2Image task for ODI and construct a dedicated subset using existing ODI depth estimation datasets. Specifically, we select images and their corresponding depth maps from the Pano3D dataset [3] and generate scene descriptions using Internvl2-5. A.2.4 Semantic to Image. Structured3D dataset [55] offers a diverse collection of indoor ODIs and their corresponding semantic...

      Poly Haven

      Description: to be filled in next round from web search / official dataset documentation.

      Paper evidence:

      • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [reference_only/medium]: employ DDIM sampling (Song et al., 2020) with 50 steps during inference. # 5.1.2 DATASETS Training. We train on a mixture of indoor and outdoor environments by combining multiple publicly available sources, including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training... || M sampling (Song et al., 2020) with 50 steps during inference. # 5.1.2 DATASETS Training. We train on a mixture of indoor and outdoor environments by combining multiple publicly available sources, including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training. While Hu... || Popa. CLIP-Mesh: Generating textured meshes from text using pretrained image-text models. In SIGGRAPH Asia, 2022. Emil Persson. Texture from Humus. https://www.humus.name/index.php?page=Textures, accessed 09/2024. polyhaven.com. HDRIs. https://polyhaven.com/hdris, accessed 09/2024. Ben Poole, Ajay Jain, Jonathan T Barron, and Ben Mildenhall. DreamFusion: Text-to-3d using 2d diffusion. arXiv preprint arXiv:2209.149...
      • TanDiT: Tangent-Plane Diffusion Transformer for High-Quality 360 {\deg} Panorama Generation [reference_only/medium]: ality images with a 1:2 aspect ratio. An overview of this inference pipeline is shown in Figure 3. # 4 Experimental Setup # 4.1 Dataset We use a combination of 3 different datasets: Matterport3D [25] (∼ 10K images) , Polyhaven [26] (∼ 750 images), and Flickr360 [27] (∼ 3K images). None of these datasets come with text captions, and so we compute these ourselves. We employ LLava-OneVision [28] to produce rich, dens... || Maciej Halber, Matthias Nießner, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D data in indoor environments. In Proc. International Conference on 3D Vision (3DV), 2017. [26] polyhaven.com. HDRIs. https://polyhaven.com/hdris, 2025. Accessed: 2025-03. [27] Mingdeng Cao, Chong Mou, Fanghua Yu, Xintao Wang, Yinqiang Zheng, Jian Zhang, Chao Dong, Gen Li, Ying Shan, Radu Timoft... || r, Manolis Savva, Shuran Song, Andy Zeng, and Yinda Zhang. Matterport3D: Learning from RGB-D data in indoor environments. In Proc. International Conference on 3D Vision (3DV), 2017. [26] polyhaven.com. HDRIs. https://polyhaven.com/hdris, 2025. Accessed: 2025-03. [27] Mingdeng Cao, Chong Mou, Fanghua Yu, Xintao Wang, Yinqiang Zheng, Jian Zhang, Chao Dong, Gen Li, Ying Shan, Radu Timofte, Xiaopeng Sun, Weiqi Li, Zhe...
      • DreamCube: 3D Panorama Generation via Multi-plane Synchronization [reference_only/medium]: urther evaluate our model’s generalization capabilities across diverse environments, we construct a more comprehensive dataset by combining multiple publicly available sources, including Structured3D [64], Pano360 [25], Polyhaven [40], Humus [38], HDRI-Skies [13] and iHDRI [14]. This combined dataset encompasses a broad spectrum of both indoor and outdoor environments, resulting in more than 30,000 panoramic insta... || lattmann, Tim Dockhorn, Jonas Müller, Joe Penna, and Robin Rombach. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. In International Conference on Learning Representations, 2024. 4 [40] polyhaven.com. HDRIs. https://polyhaven.com/hdris, accessed 02/2025. 6 [41] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-Resolution Image Synthesis with Lat... || Müller, Joe Penna, and Robin Rombach. SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis. In International Conference on Learning Representations, 2024. 4 [40] polyhaven.com. HDRIs. https://polyhaven.com/hdris, accessed 02/2025. 6 [41] Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-Resolution Image Synthesis with Latent Diffusion Models. In Proce...
      • 360Anything: Geometry-Free Lifting of Images and Videos to 360° [reference_only/medium]: name/index.php?page=Textures (accessed 09/2024) 21 56. Piccinelli, L., Yang, Y.H., Sakaridis, C., Segu, M., Li, S., Van Gool, L., Yu, F.: UniDepth: Universal monocular metric depth estimation. In: CVPR (2024) 12 57. polyhaven.com: HDRIs. https://polyhaven.com/hdris (accessed 09/2024) 21 58. Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al.... || accessed 09/2024) 21 56. Piccinelli, L., Yang, Y.H., Sakaridis, C., Segu, M., Li, S., Van Gool, L., Yu, F.: UniDepth: Universal monocular metric depth estimation. In: CVPR (2024) 12 57. polyhaven.com: HDRIs. https://polyhaven.com/hdris (accessed 09/2024) 21 58. Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., et al.: Learning transferable visual... || r model. # A.1 Training Data Image data. To facilitate a fair comparison with the previous state-of-the-art method, CubeDiff [29], we follow them to use the same panorama image datasets as training data. These include Polyhaven [57], Humus [55], Structured3D [103], and Pano360 [32]. For Structured3D, we use all three subsets, namely, empty, simple, and full. As a result, around 90% of data are synthetic rendering...

      RealEstate10K

      Description: to be filled in next round from web search / official dataset documentation.

      Paper evidence:

      • 360Anything: Geometry-Free Lifting of Images and Videos to 360° [test_eval/medium]: conditioning video are highlighted in red. We test on perspective videos generated by other video models. 3D scene reconstruction. We show more qualitative results in Figure 10. Given a narrow field-of-view video from RealEstate10K [105], 360Anything synthesizes the entire 360◦ view of the room. We can then train a 3D Gaussian Splatting model [30] on the generated panoramas for novel view synthesis. This demonstra...

      SUN360

      Description: to be filled in next round from web search / official dataset documentation.

      Paper evidence:

      • Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models [reference_only/medium]: t-to-360-Panoramas task, we propose a multistage framework to generate high resolution 360-degree panoramic images. As illustrated in Fig. 3, we first generate a low resolution image using a base model (finetuned on the SUN360 [7] dataset using the DreamBooth [2] training method), and then employ some super-resolution strategies (including diffusion-based and the GAN-based methods, like the ControlNet-Tile model a... || ng diffusion-based and the GAN-based methods, like the ControlNet-Tile model and the RealESRGAN [6]) to up-scale the result to a high resolution one. For better results, we also finetune the ControlNet-Tile model on the SUN360 dataset by generate low-resolution and high-resolution image pairs. # 2.3. Single-Image-to-360-Panoramas For the Single-Image-to-360-Panoramas task, the framework is similar to the Text-to-36...
      • 360-Degree Panorama Generation from Few Unregistered NFoV Images [reference_only/medium]: e right and left sides of the latent feature as ??′, ??′, respectively. Table 1: FID↓ results compared with other generation methods for quantitative evaluation.
        Methods | SUN360 [39] | Laval [9]
        Single | Pair (GT rots) | Pair (Pred rots) | Single | Pair (GT rots) | Pair (Pred rots)
        SIG-SS... || a standard panorama, the extra width is removed from the final image. # 5 EXPERIMENTS # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/te... || dth is removed from the final image. # 5 EXPERIMENTS # 5.1 Implementation Details 5.1.1 Data Preparation. We conducted experiments using realworld 360-degree panoramic image datasets SUN360 [39] and Laval Indoor [9]. SUN360 comprises both indoor and outdoor scenes, while Laval Indoor comprises solely indoor scenes. For SUN360, we randomly selected 2000/500 panoramas for training/testing, respectively. For Laval In...
      • Panorama Generation From NFoV Image Done Right [pretrain_source/medium]: rtion correction loss, utilizing the distortion prior in Distort-CLIP to further constrain the distortion. By introducing the decoupled pipeline, we achieve the best image quality and second-best distortion accuracy in SUN360 and SOTA performance in Laval Indoor in zeroshot manner. Notably, we only use 3K training data, which is 15 times less than the existing methods, but achieved surprising generalization abilit... || to generate no-distortion (perspective) images that are consistent with the panoramic content to validate the model’s capability systematically. Specifically, we first extract the central region of panoramas data (i.e., SUN360 [48]) and project it into perspective Table 1. Comparison of our Distort-CLIP with other models used in evaluation metric. We show the feature similarity (range from -1 to 1) between differe... || athrm{ie}}. \tag {3} $$ Results. As shown in Table 1, after fine-tuning the CLIP, we significantly improve the ability to distinguish the distortion both in image-image and image-text manners. Note that the test set of SUN360 (Panotest) is not covered within the training data range, yet Distort-CLIP can still accurately determine that their distortion types are the same, sh...
      • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [test_eval/medium]: all six cube faces as input and (2) generating individual captions for each face independently, enabling fine-grained text control. Testing. We evaluate our method on the common Laval Indoor (Gardner et al., 2017) and Sun360 (Xiao et al., 2018) datasets. Laval Indoor consists of over 2100 high quality panorama captures of various indoor environments, Sun360 encompasses around 1000 panoramas including both – indoor... || esting. We evaluate our method on the common Laval Indoor (Gardner et al., 2017) and Sun360 (Xiao et al., 2018) datasets. Laval Indoor consists of over 2100 high quality panorama captures of various indoor environments, Sun360 encompasses around 1000 panoramas including both – indoor and outdoor scenes. Note that we use those datasets only for evaluation, while Diffusion360 also uses Sun360 for training and OmniDr... || ty panorama captures of various indoor environments, Sun360 encompasses around 1000 panoramas including both – indoor and outdoor scenes. Note that we use those datasets only for evaluation, while Diffusion360 also uses Sun360 for training and OmniDreamer even leverages both datasets to train their models. Nonetheless, we decided to use these datasets for the sake of fairness and due to the lack of any proper over...
      • Conditional Panoramic Image Generation via Masked Autoregressive Modeling [reference_only/medium]: stent with the scaling of parameters and computation. Larger models learn data distributions better, such as details and panoramic geometry. Generalization. Fig. 6 visualizes the outpainting results of PAR on OOD data. SUN360 [66] is used for evaluation, which has different data distributions from Matterport3D. PAR shows decent performance across various scenarios. We also compared with several PO baselines traine... || on OOD data. SUN360 [66] is used for evaluation, which has different data distributions from Matterport3D. PAR shows decent performance across various scenarios. We also compared with several PO baselines trained on the SUN360 dataset. However, Diffusion360 [18] suffers from unrealistic scenes and lacks details. Panodiff [62] also encounters artifact problems (red sky in the 1st row). PAR can also adapt to differe... || age showing a street scene with a covered-roofed building and a close-up of a supermarket aisle (no visible text or signage) (d) PAR (ours) Figure 6: Panorama outpainting on OOD dataset. The images are from SUN360, which is out of the distribution of our training data. PAR generates realistic panorama images while previous methods have problems like artifacts, or unrealistic results. ![](images/304fa5b4...
      • JoPano: Unified Panorama Generation via Joint Modeling [reference_only/medium]: ze of 1 per GPU. We adopt a cubemap resolution of 512 × 512 × 6 and convert the generated cubemaps to ERP panoramas at 2048 × 1024 for visualization. Dataset We use the Structure3D [66] and SUN360 [57] datasets for training, containing 41,930 panoramas in total. Following [19, 20, 56], we divide the Structure3D dataset into 16,930 panoramas for training, 2,116 for validation, and 2,117... || ,930 panoramas in total. Following [19, 20, 56], we divide the Structure3D dataset into 16,930 panoramas for training, 2,116 for validation, and 2,117 for testing, and we use Qwen2.5-VL [4] to caption each panorama. For SUN360, we adopt the version provided by PanoDecouple [65], which contains 25,000 training and 4,260 testing panoramas paired with their corresponding text descriptions. We use the test set of 2,11... || [65], which contains 25,000 training and 4,260 testing panoramas paired with their corresponding text descriptions. We use the test set of 2,117 panoramas from Structure3D (mostly indoor scenes) and 4,260 panoramas from SUN360 (mostly outdoor scenes) to evaluate the performance of T2P and V2P, respectively. Evaluation Metrics We evaluate our method using six metrics. To assess image quality, we report FID [16], CL...
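
The JoPano snippet above quotes both per-split counts and a combined figure of 41,930 panoramas. A quick arithmetic check, sketched below with the numbers copied from that snippet, suggests the combined figure is the sum of the two training splits only, not of all splits. The variable names are illustrative.

```python
# Split sizes as quoted in the JoPano snippet (dataset spelled "Structure3D" there).
structured3d = {"train": 16_930, "val": 2_116, "test": 2_117}
sun360 = {"train": 25_000, "test": 4_260}

# Sum of the two training splits.
train_total = structured3d["train"] + sun360["train"]

# Sum of every Structured3D split, for comparison with other papers' totals.
all_structured3d = sum(structured3d.values())

print(train_total)        # 41930
print(all_structured3d)   # 21163
```

When recording dataset sizes, it may help to note whether a quoted total covers training data only or every split.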

        ScanNet

        Description: to be filled in next round from web search / official dataset documentation.

        Paper evidence:

        • 360Anything: Geometry-Free Lifting of Images and Videos to 360° [test_eval/medium]: H., Ouyang, W., He, T., Zhao, C., Zhang, G.: DiffPano: Scalable and consistent text to panorama generation with spherical epipolar-aware diffusion. NeurIPS (2024) 3 96. Yeshwanth, C., Liu, Y.C., Nießner, M., Dai, A.: ScanNet++: A high-fidelity dataset of 3d indoor scenes. In: ICCV (2023) 11, 21 97. Yin, T., Zhang, Q., Zhang, R., Freeman, W.T., Durand, F., Shechtman, E., Huang, X.: From slow bidirectional to fast a...
        • MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion [test_eval/medium]: h regenerates the conditioned images, and when the mask is zero, the branch generates the in-between images. Training. we adopt a two-stage training process. In the first stage, we fine-tune the SD UNet model using all ScanNet data. This stage is single-view training (Eq. 1) without the CAA blocks. In the second stage, we integrate the CAA blocks, and the image condition blocks into the UNet, and only these added... || ementary material. # 5.2 Multi view depth-to-image generation This task converts a sequence of depth images into a sequence of RGB images while preserving the underlying geometry and maintaining multiview consistency. ScanNet is an indoor RGB-D video dataset comprising over 1513 training scenes and 100 testing scenes, all with known camera parameters. We train our model on the training scenes and evaluate it on th... || nditioned image generation Training and inference details. Our generation model is derived from the stable-diffusion-2-depth framework [46]. In the initial phase, we fine-tune the model on all the perspective images of ScanNet dataset at a resolution of 192 × 256 for 10 epochs. This training process employs the AdamW optimizer [25] with a learning rate of 1e−5 and a batch size of 256, utilizing four A6...

        Structured3D

        Description: to be filled in next round from web search / official dataset documentation.

        Paper evidence:

        • PanoDiffusion: 360-degree Panorama Outpainting via Diffusion [test_eval/medium]: a two-end alignment mechanism is applied at each step of the inference denoising process (Fig. 4), which explicitly enforces the two ends of an image to be wraparound-consistent. We evaluate the proposed method on the Structured3D dataset (Zheng et al., 2020). Experimental results demonstrate that our PanoDiffusion not only significantly outperforms previous state-of-the-art methods, but is also able to provide mul... || ar image patterns, we trained a super-resolution GAN model for panoramas to produce visually plausible results at a higher resolution. # 4 EXPERIMENTS # 4.1 EXPERIMENTAL DETAILS Dataset. We estimated our model on the Structured3D dataset (Zheng et al., 2020), which provides 360° indoor RGB-D data following equirectangular projection with a 512×1024 resolution. We split the dataset into 16930 train, 2116 validation... || uality of depth panorama, we compare our method with three image-guided depth synthesis methods including BIPS (Oh et al., 2022), NLSPN (Park et al., 2020), and CSPN (Cheng et al., 2018). All models are retrained on the Structured3D dataset using their publicly available codes. # 4.2 MAIN RESULTS Following prior works, we report the quantitative results for RGB panorama outpainting with camera masks in Table 1. Al...
        • DiffPano++: Scalable and Consistent Multi-View Panorama Generation with Spherical Epipolar-Aware Diffusion [reference_only/medium]: main limitations of this task is the lack of suitable datasets. The common panoramic datasets used in single-view panorama generation consist of indoor HDR dataset [16], outdoor HDR dataset [73], HDR360-UHD dataset [9], Structured3D [75],
          [flowchart: "3D Scene" → "Habitat Simulator..."] || d and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech), pages 341–348. IEEE, 2021. [75] Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3d: A large photo-realistic dataset for structured 3d modeling. In Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm, editors, Computer Vision - ECCV 2020 - 16th European Con...
        • Spherical manifold guided diffusion model for panoramic image generation [reference_only/medium]: hu, Feng Dai, Yike Ma, Guoqing Jin, and Yongdong Zhang. Distortion-aware cnns for spherical images. In IJCAI, pages 1198–1204, 2018. 2, 3 [56] Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3d: A large photo-realistic dataset for structured 3d modeling. In Proceedings of The European Conference on Computer Vision (ECCV), 2020. 6 [57] Yufan Zhou, Ruiyi Zhang, Changyou Chen, Chun...
        • Spherical-nested diffusion model for panoramic image outpainting [test_eval/medium]: distortion to the provided regions. # 4. Experimental Results # 4.1. Experimental Settings Datasets. To evaluate the performance of our SpND model, we employed the widely applied Matterport3D (Chang et al., 2017) and Structured3D (Zheng et al., 2020) dataset for comparison. Similar to (Lin et al., 2019), we obtained 10912 panoramic images with size 1024 × 512 for the Matterport3D dataset. A total of 9,820 images... || panoramic images with size 1024 × 512 for the Matterport3D dataset. A total of 9,820 images were selected for the training, and all 10,912 images were used for evaluation to compute the sufficient statistics. For the Structured3D dataset, we followed the methodology outlined in (Wu et al., 2024b) to obtain 21,133 images, of which 19,019 images were used for training and all 21,133 images were used for evaluation... || cific prompts, we trained an additional model incorporating varying text prompts, denoted as SpNDprompt. Table 1. Quantitative comparisons with state-of-the-art methods on Matterport3D and Structured3D. The best results and the second-best results are highlighted in bold, underline.
          - | Matterport3D | Structured3D
          Met...
        • CubeDiff: Repurposing Diffusion-Based Image Models for Panorama Generation [reference_only/medium]: .2 DATASETS Training. We train on a mixture of indoor and outdoor environments by combining multiple publicly available sources, including Polyhaven (polyhaven.com, accessed 09/2024), Humus (Persson, accessed 09/2024), Structured3D (Zheng et al., 2020) and Pano360 Kocabas et al. (2021), giving in total around 48000 panoramas for training. While Humus provides an explicit cubemap representations, all other datasets... || Park, Ricardo Martin Brualla, and Philipp Henzler. IllumiNeRF: 3d relighting without inverse rendering. arXiv preprint arXiv:2406.06527, 2024. Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3d: A large photo-realistic dataset for structured 3d modeling. In European Conference on Computer Vision (ECCV), 2020. ![](images/aa008fd009ec12990c64804a8e4f4e859cafb79128a2c31f46813a086ba... || conducted an ablation study by training CubeDiff on three subsets of panoramic data: a tiny dataset containing approximately 700 panoramas from the Polyhaven dataset, a medium dataset of about 20,000 panoramas from the Structured3D dataset (the same dataset PanoDiffusion used and comparable in size to MVDiffusion), and a full dataset with over 40,000 panoramas. The results demonstrate that Cube-Diff performs robus...
        • Conditional Panoramic Image Generation via Masked Autoregressive Modeling [reference_only/medium]: Li, Chengfei Lv, Jian-Fang Hu, and Wei-Shi Zheng. Panorama generation from nfov image done right. arXiv preprint arXiv:2503.18420, 2025. 2 [78] Jia Zheng, Junfei Zhang, Jing Li, Rui Tang, Shenghua Gao, and Zihan Zhou. Structured3d: A large photo-realistic dataset for structured 3d modeling. In ECCV, 2020. 16 [79] Junwei Zhou, Xueting Li, Lu Qi, and Ming-Hsuan Yang. Layout-your-3d: Controllable and precise 3d gener... || FID when CFG = 3 and 39.76 FID when CFG = 10, indicating that too large or too small guidance strength deteriorates the quality of the generation. Ablations on other datasets. We include additional experiments using the Structured3D [78] dataset, which is a large-scale synthesized indoor dataset for house design with well-preserved poles. We sampled 9000 images for training and 1000 for testing. In Tab. 10, our meth... || 100 text prompts sampled from a subset of SUN360, respectively. Our method achieves a DS score of 0.63, outperforming StitchDiffusion’s 1.12. For zero-shot outpainting, we compare the two methods on 100 images from the Structured3D dataset, which is excluded from both training sets for fairness. As shown in Tab. 11, our model outperforms Diffusion360. Failure case. As shown in Fig. 15, our model may fail in some d...
        • iHDRI

          Description: to be filled in during the next round from web search / official dataset documentation.

          Paper evidence:

          • DreamCube: 3D Panorama Generation via Multi-plane Synchronization [reference_only/medium]: ilities across diverse environments, we construct a more comprehensive dataset by combining multiple publicly available sources, including Structured3D [64], Pano360 [25], Polyhaven [40], Humus [38], HDRI-Skies [13] and iHDRI [14]. This combined dataset encompasses a broad spectrum of both indoor and outdoor environments, resulting in more than 30,000 panoramic instances. This general setting allows us to evaluate... || ny camera: Zero-shot metric depth estimation from any camera. arXiv preprint arXiv:2501.02464, 2025. 3, 7, 8 [13] hdri skies. HDRIs. https://hdri-skies.com/, accessed 02/2025. 6 [14] hdri skies. HDRIs. https://www.ihdri.com/hdri-skiesoutdoor/, accessed 02/2025. 6 [15] Jing He, Haodong Li, Wei Yin, Yixun Liang, Leheng Li, Kaiqiang Zhou, Hongbo Zhang, Bingbing Liu, and Ying-Cong Chen. Lotus: Diffusion-based visual f...
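
The evidence bullets above follow the pattern `• <paper title> [<role>/<confidence>]: <snippet> || <snippet> ...`. Because role labels come from keyword extraction and still need manual review, the following is a minimal Python sketch of how such a line could be parsed and how `reference_only` entries could be queued for a second look. The names `Evidence`, `parse_evidence`, `needs_role_review`, and the cue list `TRAIN_CUES` are hypothetical helpers introduced for illustration; they are not part of the extraction pipeline, and the cue words are an assumed heuristic, not a validated classifier.

```python
import re
from typing import List, NamedTuple, Optional


class Evidence(NamedTuple):
    paper: str
    role: str
    confidence: str
    snippets: List[str]


# Matches "• <paper> [<role>/<confidence>]: <body>"; the title itself may contain ":".
EVIDENCE_RE = re.compile(
    r"^\s*•\s*(?P<paper>.+?)\s*\[(?P<role>[a-z_]+)/(?P<conf>[a-z]+)\]:\s*(?P<body>.*)$",
    re.S,
)

# Assumed cue words hinting that a dataset is actually used for training.
TRAIN_CUES = re.compile(r"\b(train(?:ed|ing)?|combining|construct)\b", re.I)


def parse_evidence(line: str) -> Optional[Evidence]:
    """Split one evidence bullet into title, role, confidence and snippets."""
    m = EVIDENCE_RE.match(line)
    if m is None:
        return None  # e.g. a dataset-name bullet without a [role/confidence] tag
    snippets = [s.strip() for s in m.group("body").split("||")]
    return Evidence(m.group("paper"), m.group("role"), m.group("conf"), snippets)


def needs_role_review(ev: Evidence, dataset: str) -> bool:
    """Flag reference_only entries where one snippet names the dataset
    and also contains a training cue; a human still makes the final call."""
    if ev.role != "reference_only":
        return False
    name = dataset.lower()
    return any(name in s.lower() and TRAIN_CUES.search(s) for s in ev.snippets)
```

A flagged entry is only a candidate for relabeling; the heuristic will miss paraphrased usage and can fire on unrelated sentences, so it complements rather than replaces the manual check.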