Performance Enhancement

The package can run its bootstrap methods in parallel. The unidirectional and bidirectional bootstrap sampling methods benefit the most, and the number of worker cores is set with the n_cores parameter.
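To illustrate why bootstrap resampling parallelizes well, here is a minimal sketch using only the Python standard library. It is not the package's implementation: it uses a simple resample-with-replacement bootstrap of a mean rather than the unidirectional or bidirectional methods, and the names split_replicates, bootstrap_means, and parallel_bootstrap_se are hypothetical. The idea it shows is that each replicate is independent, so the replicates can be divided among worker processes and pooled afterwards.

```python
import random
import statistics
from concurrent.futures import ProcessPoolExecutor


def split_replicates(resample_n, n_cores):
    """Divide resample_n replicates as evenly as possible across n_cores workers."""
    base, extra = divmod(resample_n, n_cores)
    return [base + (1 if i < extra else 0) for i in range(n_cores)]


def bootstrap_means(task):
    """Draw simple bootstrap means; each worker uses its own seeded generator."""
    values, n_resamples, seed = task
    rng = random.Random(seed)
    n = len(values)
    return [statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)]


def parallel_bootstrap_se(values, resample_n=1000, n_cores=4, seed=0):
    """Hypothetical helper: pool replicates from all workers and return their SD."""
    chunks = split_replicates(resample_n, n_cores)
    tasks = [(values, k, seed + i) for i, k in enumerate(chunks) if k > 0]
    with ProcessPoolExecutor(max_workers=n_cores) as pool:
        replicates = [m for part in pool.map(bootstrap_means, tasks) for m in part]
    return statistics.stdev(replicates), len(replicates)


if __name__ == "__main__":
    # Made-up illustrative data, not taken from any RDS sample.
    incomes = [float(v) for v in range(20, 120)]
    se, n = parallel_bootstrap_se(incomes, resample_n=1000, n_cores=4)
    print(f"{n} replicates, bootstrap SE of the mean: {se:.3f}")
```

Giving each worker a distinct seed keeps the workers from drawing identical resamples, and the `if __name__ == "__main__":` guard is needed on platforms that start worker processes by re-importing the script.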

Usage

# Use parallel processing for faster bootstrap
result = RDSmean(
    x='income',
    data=rds_data,
    var_est='tree_uni1',
    resample_n=2000,
    n_cores=8  # Use 8 cores for parallel processing
)

Parallel processing is available for all bootstrap-based statistical functions:

  • RDSmean() with bootstrap variance estimation

  • RDStable() with bootstrap variance estimation

  • RDSlm() with bootstrap variance estimation

Performance Comparison

Performance Scaling

Cores   Bootstrap Samples   Standard Time   Parallel Time   Speedup
1       1000                120s            120s            1.0x
4       1000                120s            18s             6.7x
8       1000                120s            12s             10.0x
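The Speedup figures are the standard time divided by the parallel time. A short sketch of that arithmetic, using the timings listed above (the function name speedup is only illustrative):

```python
# (cores, standard time in seconds, parallel time in seconds) from the timings above
timings = [(1, 120, 120), (4, 120, 18), (8, 120, 12)]


def speedup(standard_s, parallel_s):
    """Speedup is how many times faster the parallel run is than the standard run."""
    return standard_s / parallel_s


for cores, standard_s, parallel_s in timings:
    print(f"{cores} cores: {speedup(standard_s, parallel_s):.1f}x")
```

The same calculation can be applied to timings measured on your own data and hardware, which may differ from the figures shown here.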

Examples

All estimation functions support the n_cores parameter:

# Parallel mean calculation
mean_result = RDSmean(
    x='age',
    data=rds_data,
    var_est='tree_uni1',
    resample_n=1000,
    n_cores=4
)

# Parallel table calculation
table_result = RDStable(
    x="Sex",
    y="Race",
    data=rds_data,
    var_est='tree_uni1',
    resample_n=1000,
    n_cores=4
)

# Parallel regression
regression_result = RDSlm(
    data=rds_data,
    formula="Age ~ Sex",
    var_est='tree_uni1',
    resample_n=1000,
    n_cores=4
)
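When choosing a value for n_cores, it can help to check how many cores the machine actually has. The helper below is a hypothetical convenience, not part of the package, and it assumes nothing about how the package itself validates n_cores. It caps a requested value at the number of cores reported by the operating system, leaving one free by default.

```python
import os


def pick_n_cores(requested=None, reserve=1):
    """Hypothetical helper: choose a worker count no larger than the machine allows."""
    available = os.cpu_count() or 1       # os.cpu_count() can return None
    limit = max(1, available - reserve)   # keep `reserve` cores free for other work
    if requested is None:
        return limit
    return max(1, min(requested, limit))


if __name__ == "__main__":
    print(f"Suggested n_cores: {pick_n_cores(requested=8)}")
```

The returned value could then be passed as n_cores in any of the calls above, for example n_cores=pick_n_cores(8).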