Commit cdb42ac

make release-tag: Merge branch 'master' into stable
2 parents 134547a + 8de21bd commit cdb42ac

11 files changed

Lines changed: 84 additions & 26 deletions

AUTHORS.rst

Lines changed: 1 addition & 0 deletions
@@ -10,3 +10,4 @@ Contributors
 ------------
 
 * Carles Sala <csala@csail.mit.edu>
+* Kevin Kuo <kevinykuo@gmail.com>

CONTRIBUTING.rst

Lines changed: 2 additions & 2 deletions
@@ -232,6 +232,6 @@ or in command line::
     pip install 'ctgan>=X.Y.Z.dev'
 
 
-.. _GitHub issues page: https://github.com/DAI-Lab/CTGAN/issues
-.. _Travis Build Status page: https://travis-ci.org/DAI-Lab/CTGAN/pull_requests
+.. _GitHub issues page: https://github.com/sdv-dev/CTGAN/issues
+.. _Travis Build Status page: https://travis-ci.org/sdv-dev/CTGAN/pull_requests
 .. _Google docstrings style: https://google.github.io/styleguide/pyguide.html?showone=Comments#Comments

HISTORY.md

Lines changed: 14 additions & 2 deletions
@@ -1,14 +1,26 @@
 # History
 
+## v0.2.1 - 2020-01-27
+
+Minor version including changes to ensure the logs are properly printed and
+the option to disable the log transformation to the discrete column frequencies.
+
+Special thanks to @kevinykuo for the contributions!
+
+### Issues Resolved:
+
+* Option to sample from true data frequency instead of logged frequency - [Issue #16](https://github.com/sdv-dev/CTGAN/issues/16) by @kevinykuo
+* Flush stdout buffer for epoch updates - [Issue #14](https://github.com/sdv-dev/CTGAN/issues/14) by @kevinykuo
+
 ## v0.2.0 - 2019-12-18
 
 Reorganization of the project structure with a new Python API, new Command Line Interface
 and increased data format support.
 
 ### Issues Resolved:
 
-* Reorganize the project structure - [Issue #10](https://github.com/DAI-Lab/CTGAN/issues/10) by @csala
-* Move epochs to the fit method - [Issue #5](https://github.com/DAI-Lab/CTGAN/issues/5) by @csala
+* Reorganize the project structure - [Issue #10](https://github.com/sdv-dev/CTGAN/issues/10) by @csala
+* Move epochs to the fit method - [Issue #5](https://github.com/sdv-dev/CTGAN/issues/5) by @csala
 
 ## v0.1.0 - 2019-11-07
 

README.md

Lines changed: 24 additions & 12 deletions
@@ -1,26 +1,26 @@
 <p align="left">
-<img width=15% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt=DAI-Lab />
+<img width=15% src="https://dai.lids.mit.edu/wp-content/uploads/2018/06/Logo_DAI_highres.png" alt=sdv-dev />
 <i>An open source project from Data to AI Lab at MIT.</i>
 </p>
 
 [![PyPI Shield](https://img.shields.io/pypi/v/ctgan.svg)](https://pypi.python.org/pypi/ctgan)
-[![Travis CI Shield](https://travis-ci.org/DAI-Lab/CTGAN.svg?branch=master)](https://travis-ci.org/DAI-Lab/CTGAN)
+[![Travis CI Shield](https://travis-ci.org/sdv-dev/CTGAN.svg?branch=master)](https://travis-ci.org/sdv-dev/CTGAN)
 [![Downloads](https://pepy.tech/badge/ctgan)](https://pepy.tech/project/ctgan)
-[![Coverage Status](https://codecov.io/gh/DAI-Lab/CTGAN/branch/master/graph/badge.svg)](https://codecov.io/gh/DAI-Lab/CTGAN)
+[![Coverage Status](https://codecov.io/gh/sdv-dev/CTGAN/branch/master/graph/badge.svg)](https://codecov.io/gh/sdv-dev/CTGAN)
 
 # CTGAN
 
 Implementation of our NeurIPS paper [Modeling Tabular data using Conditional GAN](https://arxiv.org/abs/1907.00503).
 
 CTGAN is a GAN-based data synthesizer that can generate synthetic tabular data with high fidelity.
 
-- Free software: [MIT license](https://github.com/DAI-Lab/CTGAN/tree/master/LICENSE)
-- Documentation: https://DAI-Lab.github.io/CTGAN
-- Homepage: https://github.com/DAI-Lab/CTGAN
+* License: [MIT](https://github.com/sdv-dev/CTGAN/blob/master/LICENSE)
+* Documentation: https://sdv-dev.github.io/CTGAN
+* Homepage: https://github.com/sdv-dev/CTGAN
 
 ## Overview
 
-Based on previous work ([TGAN](https://github.com/DAI-Lab/TGAN)) on synthetic data generation,
+Based on previous work ([TGAN](https://github.com/sdv-dev/TGAN)) on synthetic data generation,
 we develop a new model called CTGAN. Several major differences make CTGAN outperform TGAN.
 
 - **Preprocessing**: CTGAN uses more sophisticated Variational Gaussian Mixture Model to detect
@@ -49,7 +49,7 @@ pip install ctgan
 This will pull and install the latest stable release from [PyPI](https://pypi.org/).
 
 If you want to install from source or contribute to the project please read the
-[Contributing Guide](https://DAI-Lab.github.io/CTGAN/contributing.html#get-started).
+[Contributing Guide](https://sdv-dev.github.io/CTGAN/contributing.html#get-started).
 
 # Data Format
 
@@ -179,13 +179,13 @@ must be rounded to integers in a later step, outside of CTGAN.
 # Join our community
 
 1. If you would like to try more dataset examples, please have a look at the [examples folder](
-https://github.com/DAI-Lab/CTGAN/tree/master/examples) of the repository. Please contact us
+https://github.com/sdv-dev/CTGAN/tree/master/examples) of the repository. Please contact us
 if you have a usage example that you would want to share with the community.
 2. If you want to contribute to the project code, please head to the [Contributing Guide](
-https://DAI-Lab.github.io/CTGAN/contributing.html#get-started) for more details about how to do it.
+https://sdv-dev.github.io/CTGAN/contributing.html#get-started) for more details about how to do it.
 3. If you have any doubts, feature requests or detect an error, please [open an issue on github](
-https://github.com/DAI-Lab/CTGAN/issues)
-4. Also do not forget to check the [project documentation site](https://DAI-Lab.github.io/CTGAN/)!
+https://github.com/sdv-dev/CTGAN/issues)
+4. Also do not forget to check the [project documentation site](https://sdv-dev.github.io/CTGAN/)!
 
 
 # Citing TGAN
@@ -202,3 +202,15 @@ If you use CTGAN, please cite the following work:
 year={2019}
 }
 ```
+
+# Related Projects
+
+## R interface for CTGAN
+
+A wrapper around **CTGAN** has been implemented by Kevin Kuo @kevinykuo, bringing the functionalities
+of **CTGAN** to **R** users.
+
+More details can be found in the corresponding repository: https://github.com/kasaai/ctgan
+
+Please note that this package is an external contribution and is not maintained nor supervised by
+the MIT DAI-Lab team.

ctgan/__init__.py

Lines changed: 1 addition & 1 deletion
@@ -4,7 +4,7 @@
 
 __author__ = 'MIT Data To AI Lab'
 __email__ = 'dailabmit@gmail.com'
-__version__ = '0.2.0'
+__version__ = '0.2.1.dev1'
 
 from ctgan.demo import load_demo
 from ctgan.synthesizer import CTGANSynthesizer

ctgan/conditional.py

Lines changed: 3 additions & 2 deletions
@@ -2,7 +2,7 @@
 
 
 class ConditionalGenerator(object):
-    def __init__(self, data, output_info):
+    def __init__(self, data, output_info, log_frequency):
         self.model = []
 
         start = 0
@@ -50,7 +50,8 @@ def __init__(self, data, output_info):
                     continue
                 end = start + item[0]
                 tmp = np.sum(data[:, start:end], axis=0)
-                tmp = np.log(tmp + 1)
+                if log_frequency:
+                    tmp = np.log(tmp + 1)
                 tmp = tmp / np.sum(tmp)
                 self.p[self.n_col, :item[0]] = tmp
                 self.interval.append((self.n_opt, item[0]))
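
Note on the hunk above: the new `log_frequency` flag decides whether category counts are passed through `log(count + 1)` before being normalized into sampling probabilities. The following is a minimal pure-Python sketch of that step, not the library code (which operates on NumPy arrays). The helper name `category_probabilities` and the sample counts are hypothetical.

```python
import math


def category_probabilities(counts, log_frequency=True):
    """Turn per-category counts into sampling probabilities.

    Follows the logic of the hunk: optionally apply log(count + 1),
    then normalize so the values sum to 1.
    """
    values = [float(c) for c in counts]
    if log_frequency:
        values = [math.log(v + 1) for v in values]
    total = sum(values)
    return [v / total for v in values]


# Hypothetical counts for a discrete column with one dominant category.
counts = [900, 90, 10]
logged = category_probabilities(counts, log_frequency=True)
raw = category_probabilities(counts, log_frequency=False)
```

With `log_frequency=False` the probabilities are simply the empirical frequencies of the categories. With `log_frequency=True` the distribution is flattened, so the rare category is sampled more often than it appears in the data.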

ctgan/synthesizer.py

Lines changed: 11 additions & 3 deletions
@@ -95,7 +95,7 @@ def _cond_loss(self, data, c, m):
 
         return (loss * m).sum() / data.size()[0]
 
-    def fit(self, train_data, discrete_columns=tuple(), epochs=300):
+    def fit(self, train_data, discrete_columns=tuple(), epochs=300, log_frequency=True):
         """Fit the CTGAN Synthesizer models to the training data.
 
         Args:
@@ -109,6 +109,9 @@ def fit(self, train_data, discrete_columns=tuple(), epochs=300):
                 a ``pandas.DataFrame``, this list should contain the column names.
             epochs (int):
                 Number of training epochs. Defaults to 300.
+            log_frequency (boolean):
+                Whether to use log frequency of categorical levels in conditional
+                sampling. Defaults to ``True``.
         """
 
         self.transformer = DataTransformer()
@@ -118,7 +121,11 @@ def fit(self, train_data, discrete_columns=tuple(), epochs=300):
         data_sampler = Sampler(train_data, self.transformer.output_info)
 
         data_dim = self.transformer.output_dimensions
-        self.cond_generator = ConditionalGenerator(train_data, self.transformer.output_info)
+        self.cond_generator = ConditionalGenerator(
+            train_data,
+            self.transformer.output_info,
+            log_frequency
+        )
 
         self.generator = Generator(
             self.embedding_dim + self.cond_generator.n_opt,
@@ -215,7 +222,8 @@ def fit(self, train_data, discrete_columns=tuple(), epochs=300):
                 optimizerG.step()
 
             print("Epoch %d, Loss G: %.4f, Loss D: %.4f" %
-                  (i + 1, loss_g.detach().cpu(), loss_d.detach().cpu()),
+                  (i + 1, loss_g.detach().cpu(), loss_d.detach().cpu()),
+                  flush=True)
 
     def sample(self, n):
         """Sample data similar to the training data.

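Note on the last hunk of `ctgan/synthesizer.py`: passing `flush=True` to `print` forces the stream to be flushed after each epoch line, so progress appears immediately even when stdout is block-buffered (for example when piped to a file or another process). The sketch below is a stand-alone illustration under stated assumptions: `report_epoch` and `CountingStream` are hypothetical names, and plain floats stand in for the tensor losses used in the real code.

```python
import io
import sys


def report_epoch(epoch, loss_g, loss_d, stream=None):
    """Print one progress line and flush the stream right away."""
    stream = sys.stdout if stream is None else stream
    print("Epoch %d, Loss G: %.4f, Loss D: %.4f" % (epoch, loss_g, loss_d),
          file=stream, flush=True)


class CountingStream(io.StringIO):
    """In-memory stream that counts flush() calls (demonstration only)."""

    def __init__(self):
        super().__init__()
        self.flush_calls = 0

    def flush(self):
        self.flush_calls += 1
        super().flush()


stream = CountingStream()
report_epoch(1, 0.1234, 1.5, stream=stream)
```

After the call, `stream.flush_calls` is at least 1, which shows that `print` invoked `flush()` on the target stream instead of leaving the line in a buffer.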
docs/conf.py

Lines changed: 1 addition & 1 deletion
@@ -60,7 +60,7 @@
 copyright = '2019, MIT Data To AI Lab'
 author = 'MIT Data To AI Lab'
 description = 'Conditional GAN for Tabular Data'
-user = 'DAI-Lab'
+user = 'sdv-dev'
 
 # The version info for the project you're documenting, acts as replacement
 # for |version| and |release|, also used in various other places throughout

setup.cfg

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 [bumpversion]
-current_version = 0.2.0
+current_version = 0.2.1.dev1
 commit = True
 tag = True
 parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(\.(?P<release>[a-z]+)(?P<candidate>\d+))?
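
Note on the `parse` pattern above: it splits a version string into `major`, `minor`, `patch` and an optional `release` plus `candidate` suffix, which is how `0.2.1.dev1` is understood as a development candidate of `0.2.1`. The sketch below applies the same regular expression with Python's `re` module. The helper `parse_version` is hypothetical, and using `fullmatch` is an assumption made for the illustration, not necessarily how the bumpversion tool applies the pattern.

```python
import re

# Same pattern as the ``parse`` option in setup.cfg.
PARSE = re.compile(
    r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(\.(?P<release>[a-z]+)(?P<candidate>\d+))?"
)


def parse_version(version):
    """Return the named parts of a version string, or None if it does not match."""
    match = PARSE.fullmatch(version)
    return match.groupdict() if match else None


dev = parse_version("0.2.1.dev1")
final = parse_version("0.2.0")
```

Here `dev` carries `release == "dev"` and `candidate == "1"`, while `final` has both fields set to `None` because the optional suffix is absent.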

setup.py

Lines changed: 2 additions & 2 deletions
@@ -93,7 +93,7 @@
     setup_requires=setup_requires,
     test_suite='tests',
     tests_require=tests_require,
-    url='https://github.com/DAI-Lab/CTGAN',
-    version='0.2.0',
+    url='https://github.com/sdv-dev/CTGAN',
+    version='0.2.1.dev1',
     zip_safe=False,
 )
