Package 'SimRDS' reference manual

Title:	Simulation of Respondent Driven Samples
Description:	Simulate populations with desired properties and extract respondent driven samples. To better understand the usage of the package and the algorithm used, please refer to Perera, A., and Ramanayake, A. (2019) <https://www.aimr.tirdiconference.com/assets/images/portfolio/Conference-Proceeding-AIMR-19.pdf>.
Authors:	Ayesha Perera [aut, cre]
Maintainer:	Ayesha Perera <[email protected]>
License:	GPL-2
Version:	2.0.0
Built:	2025-02-08 05:39:05 UTC
Source:	https://github.com/cran/SimRDS

Generates an RDS Population

Description

Generates RDS populations according to the user's preferred variability. Keep the computer's capacity in mind when using this function, since it can affect how the underlying methods perform. A population of size 10,000 can be produced with 4 GB of RAM without issue; larger populations need more RAM. The generated population consists of the degree size, a dichotomous independent variable, a dichotomous response variable, and, for each respondent, the IDs of the respondents in their network.

Usage

population(
  N = 1000,
  p.ties = 0.33,
  minVal = 0,
  maxVal = 300,
  zeros = 2,
  dis_type = "rexp",
  skew = 0.05,
  pr = 0.33,
  pa = 0.33,
  atype_char = "NULL",
  atype_res = "NULL"
)

Arguments

`N`	Population size.
`p.ties`	Number of ties to be in the population.
`minVal`	The minimum degree size an individual in the population can have.
`maxVal`	The maximum degree size an individual in the population can have.
`zeros`	The maximum number of zeros the population can have.
`dis_type`	The base distribution of the degrees: "rnorm", "rexp", or "runif".
`skew`	Needed when the distribution is left- or right-skewed. When selecting the skewness value, it is advisable to first observe the range of values given by the rep function and check whether the maximum value of the distribution is close to the defined maximum.
`pr`	Proportion of individuals that have '1' as the response in the response variable.
`pa`	Proportion of individuals that have '1' as their character in the independent variable.
`atype_char`	Defines how the independent variable is associated with other external factors. Use "NULL" when it should be randomly distributed. Use "net" to associate it with the network size; network size is the only association available. Abbreviation: "network size = net".
`atype_res`	Defines how the response variable is associated with other external factors. Use "NULL" when it should be randomly distributed. To associate it with both the independent variable and the network size, use "char * net". It can be associated with the network size and the independent variable, using these abbreviations: "network size = net", "independent dichotomous variable = char".
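
The advice on `skew` can be tried out with a quick look at a candidate base distribution before calling `population`. The sketch below is only an illustration in base R: it assumes, without confirmation from this manual, that `skew` behaves like the `rate` argument of `rexp`, and the names `maxVal_target`, `draws`, and `share_above` are chosen here for the example.

```r
# Sketch: inspect the range of a candidate base distribution.
# ASSUMPTION: `skew` is treated here as the `rate` of rexp(); the package
# may use the value differently.
set.seed(1)
N <- 1000
maxVal_target <- 300   # hypothetical: the maxVal you intend to pass
skew <- 0.05

draws <- rexp(N, rate = skew)
print(range(draws))
print(summary(draws))

# Share of draws that would exceed the intended maximum degree
share_above <- mean(draws > maxVal_target)
print(share_above)
```

If the maximum of the draws sits far below or far above the intended `maxVal`, a different `skew` value may be worth trying before generating the full population.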

Value

A simulated population with desired characteristics.

Examples


RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0,
pr = .5,pa = .1, atype_char = "net",atype_res = "net*char")

# This function may take considerable time to produce output, depending on
# the computer's performance. For a quick reference to the expected result,
# a saved version of the output is available.
data(RDS.population)

The list that specifies the output of the `population` function.

Description

This list contains the output of the function population so that it can be used in the sampling examples. Depending on the performance of the computer, it can take a considerable amount of time to produce this output.

Usage

RDS.population

Format

An object of class list of length 4.
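
A short way to see what the four elements are is to load the saved object and print its top-level structure. This is a sketch rather than documented usage: the helper `summarise_elements` is hypothetical, and `frame` is the only element name taken from this manual (it appears in the `samplingEx` example). The package call is guarded so the snippet also runs where SimRDS is not installed.

```r
# Sketch: look at the top-level structure of the saved population.
# Only the `frame` element is named in this manual; no other element
# names are assumed here.
summarise_elements <- function(x) {
  # First class of each list element, returned as a character vector
  vapply(x, function(el) class(el)[1], character(1))
}

if (requireNamespace("SimRDS", quietly = TRUE)) {
  data(RDS.population, package = "SimRDS")
  print(length(RDS.population))               # documented as length 4
  print(summarise_elements(RDS.population))
  print(head(RDS.population$frame))
}
```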

The dataframe that specifies the output of the `samplingEx` function.

Description

The function samplingEx may take considerable time to produce output, depending on the computer's performance. For a quick reference to the expected result, a saved version of the output is available.

Usage

RDS.samplingEx

Format

An object of class rds.data.frame (inherits from data.frame) with 300 rows and 8 columns.

The dataframe that specifies the output of the `samplingP` function.

Description

The function samplingP may take considerable time to produce output, depending on the computer's performance. For a quick reference to the expected result, a saved version of the output is available.

Usage

RDS.samplingP

Format

An object of class rds.data.frame (inherits from data.frame) with 1899 rows and 7 columns.

Extracts samples from external populations

Description

Extract samples from populations that the user supplies from an external source, rather than simulating a population with the built-in function 'population'. Data should be in .csv format. Inputs are expected in the same format as the output of the built-in function 'population'.

Usage

samplingEx(
  population,
  response_column,
  degree_column,
  categorical_column,
  id_column,
  degree = NULL,
  net = NULL,
  seeds = 10,
  coupon = 2,
  waves = NULL,
  size = 300,
  prob = NULL,
  prop_sam = NULL,
  showids = TRUE,
  timeInterval = NULL
)

Arguments

`population`	.csv file containing the population details. The first column should contain the individual's ID.
`response_column`	Column including responses.
`degree_column`	Column including degrees.
`categorical_column`	Column including categories.
`id_column`	Column including IDs.
`degree`	.csv file containing the network sizes. If it is not provided, the degree size is the number of individuals in the networks provided. If neither the networks nor the degrees are provided, degrees are produced randomly for each respondent.
`net`	.csv file containing each individual's network. It should have the same format as the 'net size table' in the output of the built-in function 'population'. If not provided, individuals are assigned randomly, taking the degree into account.
`seeds`	Number of seeds. It should be a positive integer.
`coupon`	Number of coupons. It should be a positive integer.
`waves`	Number of waves. It should be a positive integer.
`size`	Sample size. It should be a positive integer.
`prob`	Whether the seeds are selected probabilistically. To select seeds randomly, leave it as 'NULL'. To relate the selection to a variable, give the variable name. For example, if seed selection should depend on the network size, use 'network'; if it should be inversely proportional, enter '1/network'.
`prop_sam`	Use "NULL" if the individuals from an individual's network should be chosen randomly. If the selection should be done probabilistically, state the variable and how it should be related.
`showids`	If TRUE, it displays the IDs being recruited.
`timeInterval`	Days between recruitment waves. Zeroth wave is assumed to recruit at the present day.
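
Before passing an external table to `samplingEx`, it can help to run a few sanity checks on it. The sketch below uses base R only. The data, the column names (`id`, `outcome`, `degree`, `group`), and the helper `check_population` are all hypothetical, and apart from the first-column ID rule stated above, the checks are general precautions rather than documented requirements of the package.

```r
# Sketch: basic checks on an external population table before sampling.
# All column names and values below are hypothetical.
ext_pop <- data.frame(
  id      = 1:6,
  outcome = c(1, 0, 0, 1, 1, 0),
  degree  = c(3, 5, 2, 4, 1, 6),
  group   = c(0, 1, 1, 0, 0, 1)
)

check_population <- function(df, id_column, degree_column) {
  stopifnot(
    names(df)[1] == id_column,            # ID expected in the first column
    !anyDuplicated(df[[id_column]]),      # precaution: IDs are unique
    all(df[[degree_column]] >= 0),        # precaution: no negative degrees
    all(df[[degree_column]] %% 1 == 0)    # precaution: whole-number degrees
  )
  invisible(TRUE)
}

check_population(ext_pop, id_column = "id", degree_column = "degree")

# Round trip through .csv, the format this page asks for
csv_path <- tempfile(fileext = ".csv")
write.csv(ext_pop, csv_path, row.names = FALSE)
ext_pop_read <- read.csv(csv_path)
```

The column names used in the table would then be supplied through `response_column`, `degree_column`, `categorical_column`, and `id_column`.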

Value

RDS sample

Examples


RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0,
pr = .5,pa = .1, atype_char = "net",atype_res = "net*char")

# This should be replaced with a real world dataset.
RDS.samplingEx <- samplingEx(population=RDS.population$frame,
response_column="response", degree_column="network",
categorical_column="character", id_column="s", seeds = 40, coupon = 2,
waves =8)


# This function may take considerable time to produce output, depending on
# the computer's performance. For a quick reference to the expected result,
# a saved version of the output is available.
data(RDS.samplingEx)


Extract samples from generated populations.

Description

Extract samples from populations generated with the 'SimRDS::population' function. The output can be passed directly to the RDS estimators in the RDS package.

Usage

samplingP(
  population,
  seeds,
  coupon,
  waves = NULL,
  size = NULL,
  prob = NULL,
  character_sam = NULL,
  showids = TRUE,
  timeInterval = NULL
)

Arguments

`population`	Population generated using the 'SimRDS::population' function.
`seeds`	Number of seeds.
`coupon`	Number of coupons.
`waves`	Number of waves.
`size`	Sample size.
`prob`	Whether the seeds are selected probabilistically. To select seeds randomly, leave it as 'NULL'. To relate the selection to a variable, give the variable name. For example, if seed selection should depend on the network size, use 'network'; if it should be inversely proportional, enter '1/network'.
`character_sam`	Use "NULL" if the individuals from an individual's network should be chosen randomly. If the selection should be done probabilistically, state the variable and how it should be related. Possible inputs are NULL, netSize, character.
`showids`	If TRUE, it displays the IDs being recruited.
`timeInterval`	Days between recruitment waves. Zeroth wave is assumed to recruit at the present day.
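
The difference between `prob = 'network'` and `prob = '1/network'` can be pictured with base R's `sample`. The sketch below illustrates the idea of proportional and inversely proportional selection weights only; it is not the package's internal code, and the vectors `ids` and `network` hold made-up values.

```r
# Sketch: what proportional and inversely proportional seed selection
# could look like. Illustration only; not the package's implementation.
set.seed(42)
ids     <- 1:8
network <- c(2, 4, 4, 10, 1, 6, 3, 20)   # hypothetical network sizes

# Weights in the spirit of prob = 'network': larger networks are favoured
w_prop <- network / sum(network)

# Weights in the spirit of prob = '1/network': smaller networks are favoured
w_inv <- (1 / network) / sum(1 / network)

# Draw three seeds without replacement under each weighting
seeds_prop <- sample(ids, size = 3, prob = w_prop)
seeds_inv  <- sample(ids, size = 3, prob = w_inv)

print(round(w_prop, 3))
print(round(w_inv, 3))
```

Under the first weighting the individual with network size 20 carries the largest weight; under the second it is the individual with network size 1.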

Value

RDS sample

Examples


RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0,
pr = .5,pa = .1, atype_char = "net",atype_res = "net*char")

RDS.samplingP <- samplingP(population=RDS.population, seeds = 40, coupon = 2,
waves = 8, prob = '1/network', character_sam = 'netSize')


# This function may take considerable time to produce output, depending on
# the computer's performance. For a quick reference to the expected result,
# a saved version of the output is available.
data(RDS.samplingP)
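
Since the sample is an rds.data.frame, one possible next step is to hand it to an estimator from the RDS package. The sketch below is hedged in two ways: the outcome column name `"response"` is an assumption (this manual does not list the column names of the sample), so the code prints `names()` first and only calls the estimator if that column exists; and the whole block is guarded so it does nothing where SimRDS or RDS is not installed.

```r
# Sketch: pass the saved sample to an estimator from the RDS package.
# ASSUMPTION: the outcome column is called "response"; check names() first.
outcome <- "response"

if (requireNamespace("SimRDS", quietly = TRUE) &&
    requireNamespace("RDS", quietly = TRUE)) {
  data(RDS.samplingP, package = "SimRDS")
  print(names(RDS.samplingP))
  if (outcome %in% names(RDS.samplingP)) {
    est <- RDS::RDS.II.estimates(RDS.samplingP, outcome.variable = outcome)
    print(est)
  }
}
```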

