SDGym should be able to automatically discover SDV Enterprise synthesizers #489
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #489 +/- ##
==========================================
+ Coverage 74.86% 76.04% +1.17%
==========================================
Files 29 30 +1
Lines 2399 2371 -28
==========================================
+ Hits 1796 1803 +7
+ Misses 603 568 -35
@amontanez24 @pvk-developer Two things to consider:
If I were to implement this as the issue describes, we should get all the synthesizers that can be accessed this way:
In [3]: from sdv import single_table
In [4]: dict([(name, cls) for name, cls in single_table.__dict__.items() if isinstance(cls, type)])
Out[4]:
{'CopulaGANSynthesizer': sdv.single_table.copulagan.CopulaGANSynthesizer,
'GaussianCopulaSynthesizer': sdv.single_table.copulas.GaussianCopulaSynthesizer,
'CTGANSynthesizer': sdv.single_table.ctgan.CTGANSynthesizer,
'TVAESynthesizer': sdv.single_table.ctgan.TVAESynthesizer,
'DayZSynthesizer': sdv_enterprise.sdv.single_table.dayz.day_zero.DayZSynthesizer,
'BootstrapSynthesizer': sdv_enterprise.sdv.single_table.bootstrap.bootstrap.BootstrapSynthesizer,
'DPGCFlexSynthesizer': sdv_enterprise.sdv.single_table.differential_privacy.dp_gc_flex_synthesizer.DPGCFlexSynthesizer,
'DPGCSynthesizer': sdv_enterprise.sdv.single_table.differential_privacy.dp_gc_synthesizer.DPGCSynthesizer,
'SegmentSynthesizer': sdv_enterprise.sdv.single_table.segment.segment.SegmentSynthesizer,
'XGCSynthesizer': sdv_enterprise.sdv.single_table.xgc.xgc.XGCSynthesizer}
Then my other comment about how you got the subclasses can go away, and you could just use this functionality instead.
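The listing above works by filtering a module's namespace for class objects. A minimal sketch of that idea as a reusable helper follows. The name `discover_classes` is hypothetical and not part of SDGym or SDV, and the standard-library `collections` module stands in for `sdv.single_table` so the snippet runs without SDV installed.

```python
import collections
import inspect


def discover_classes(module):
    """Return {name: class} for every public class found in a module's namespace."""
    return {
        name: obj
        for name, obj in vars(module).items()
        if inspect.isclass(obj) and not name.startswith('_')
    }


# `collections` is only a stand-in for `sdv.single_table` (assumption for this sketch).
found = discover_classes(collections)
print(sorted(found))
```

With SDV Enterprise installed, the same filter applied to `sdv.single_table` would presumably pick up the enterprise synthesizers shown in the output above, since they appear in that module's namespace.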
amontanez24
left a comment
I might be missing something, but my main thought is that we don't need to make the classes importable. I think we can just create a class object from a string matching a valid SDV synthesizer when the benchmark requests it.
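One way to read this suggestion is a lazy lookup: resolve the class from its name only at the moment the benchmark asks for it. The sketch below is an illustration under that assumption; `load_class` is a hypothetical helper, not an SDGym function, and it is exercised against the standard library so it runs without SDV.

```python
import importlib


def load_class(module_name, class_name):
    """Import `module_name` and return the class named `class_name` from it."""
    module = importlib.import_module(module_name)
    candidate = getattr(module, class_name, None)
    if not isinstance(candidate, type):
        raise ValueError(f'{class_name!r} is not a class in {module_name!r}')
    return candidate


# Stand-in lookup. With SDV installed, the call might instead look like
# load_class('sdv.single_table', 'GaussianCopulaSynthesizer').
ordered_dict_cls = load_class('collections', 'OrderedDict')
instance = ordered_dict_cls(a=1)
```

Raising on an unknown name keeps a typo in a benchmark configuration from silently resolving to a non-class attribute.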
9d27f3d to cf9ea70
pvk-developer
left a comment
LGTM!
sdgym/utils.py
Outdated
synthesizer_name = synthesizer
if synthesizer in st_sdv_synthesizers + mt_sdv_synthesizers:
    modality = 'single_table' if synthesizer in st_sdv_synthesizers else 'multi_table'
    instance = BaselineSDVSynthesizer(synthesizer, modality)
Instead of directly returning an instance of BaselineSDVSynthesizer, can you add a function, create_sdv_synthesizer_class, and in it return something like:
CustomSynthesizer = type(
    class_name,  # Should be the synthesizer name in sdv
    (BaselineSynthesizer,),
    {
        '__module__': __name__,
        'get_trained_synthesizer': get_trained_synthesizer,
        'sample_from_synthesizer': sample_from_synthesizer,
    },
)
Yes done in 33e5813
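For readers unfamiliar with the three-argument form of `type()`, here is a self-contained sketch of the pattern requested in this thread. It is not the code from 33e5813: the `BaselineSynthesizer` below is a stand-in for SDGym's base class, and the two method bodies are placeholders that replay the training rows rather than fitting an SDV model.

```python
class BaselineSynthesizer:
    """Stand-in for SDGym's base class (assumption for this sketch)."""

    def get_trained_synthesizer(self, data, metadata):
        raise NotImplementedError

    def sample_from_synthesizer(self, synthesizer, n_samples):
        raise NotImplementedError


def create_sdv_synthesizer_class(class_name):
    """Build a BaselineSynthesizer subclass named after an SDV synthesizer."""

    def get_trained_synthesizer(self, data, metadata):
        # Placeholder: a real version would fit the SDV synthesizer `class_name`.
        return {'name': class_name, 'rows': list(data)}

    def sample_from_synthesizer(self, synthesizer, n_samples):
        # Placeholder: cycle through the stored rows instead of sampling a model.
        rows = synthesizer['rows']
        return [rows[i % len(rows)] for i in range(n_samples)]

    return type(
        class_name,  # the generated class takes the synthesizer's name
        (BaselineSynthesizer,),
        {
            '__module__': __name__,
            'get_trained_synthesizer': get_trained_synthesizer,
            'sample_from_synthesizer': sample_from_synthesizer,
        },
    )


CustomSynthesizer = create_sdv_synthesizer_class('GaussianCopulaSynthesizer')
trained = CustomSynthesizer().get_trained_synthesizer([1, 2, 3], metadata=None)
samples = CustomSynthesizer().sample_from_synthesizer(trained, 5)
```

Returning a class rather than an instance means the benchmark can treat dynamically created synthesizers the same way as statically defined `BaselineSynthesizer` subclasses.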
@amontanez24 I did some additional tests locally and also checked that it works on EC2 instances.
Resolve #481
Resolve #491
CU-86b7b0kaa
I'm sharing this notebook so you can try out the feature.