Title: | Simulation of Respondent Driven Samples |
---|---|
Description: | Simulate populations with desired properties and extract respondent driven samples. To better understand the usage of the package and the algorithm used, please refer to Perera, A., and Ramanayake, A. (2019) <https://www.aimr.tirdiconference.com/assets/images/portfolio/Conference-Proceeding-AIMR-19.pdf>. |
Authors: | Ayesha Perera [aut, cre] |
Maintainer: | Ayesha Perera <[email protected]> |
License: | GPL-2 |
Version: | 2.0.0 |
Built: | 2024-11-10 08:00:03 UTC |
Source: | https://github.com/cran/SimRDS |
Generates RDS populations according to the user's preferred variability. Do note the computer capacities when using the function since it can affect the functionality of the methods used. A population size of 10,000 can be produced with a RAM of 4GB without an issue. But if higher population sizes are needed, then higher RAM is needed. The produced population consists of the degree size, a dichotomous independent variable, a dichotomous response variable and the IDs of the respondents in the network of each response.
population( N = 1000, p.ties = 0.33, minVal = 0, maxVal = 300, zeros = 2, dis_type = "rexp", skew = 0.05, pr = 0.33, pa = 0.33, atype_char = "NULL", atype_res = "NULL" )
population( N = 1000, p.ties = 0.33, minVal = 0, maxVal = 300, zeros = 2, dis_type = "rexp", skew = 0.05, pr = 0.33, pa = 0.33, atype_char = "NULL", atype_res = "NULL" )
N |
Population size. |
p.ties |
Number of ties to be in the population. |
minVal |
The minimum degree size an individual in the population can have. |
maxVal |
The maximum degree size an individual in the population can have. |
zeros |
The maximum number of zeros the population can have |
dis_type |
Specify the base distribution of the degrees. ("rnorm, rexp or runif") |
skew |
is needed when the distribution is left- or right-skewed. When selecting the value for the skewness, it is advisable to first observe the range of values given from the rep function and to see whether the maximum value of the distribution is close to the defined maximal. |
pr |
Proportion of individuals that has '1' as the response in the response variable. |
pa |
Proportion of individuals that has '1' as their character in the independent variable. |
atype_char |
Defines how the independent variable is associated with other external factors. "NULL" when it needs to be randomly distributed. If it needs to be associated with the network size; use "net". It can be associated with network size only. You must note these abbreviations when you associate them with these variables: "network size = net." |
atype_res |
Defines how the response variable is associated with other external factors."NULL" is to be randomly distributed. If it needs to be associated with both independent v; use "char * net". Can associate with the network size and the independent variable abbreviations when needed to associate with these variables. "network size = net", "independent dichotomous variable = char." |
A simulated population with desired characteristics.
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.population)
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.population)
population
function.This list contains the output of the function population
so that it
could be used in the sampling examples. Depending on the performance of the
computer it could take a considerable amount of time to produce this output.
RDS.population
RDS.population
An object of class list
of length 4.
samplingEx
function.The function samplingEx
may take considerable time to produce output,
depending on the computer's performance. For a quick reference to the expected
result, a saved version of the output is available.
RDS.samplingEx
RDS.samplingEx
An object of class rds.data.frame
(inherits from data.frame
) with 300 rows and 8 columns.
samplingP
function.The function samplingP
may take considerable time to produce output,
depending on the computer's performance. For a quick reference to the expected
result, a saved version of the output is available.
RDS.samplingP
RDS.samplingP
An object of class rds.data.frame
(inherits from data.frame
) with 1899 rows and 7 columns.
Extract samples from populations that the user inputs from an external source without using the inbuilt function 'population' to simulate a population. Data should be in .csv format. Inputs are recruited in the format of the output given in the inbuilt function 'population.'
samplingEx( population, response_column, degree_column, categorical_column, id_column, degree = NULL, net = NULL, seeds = 10, coupon = 2, waves = NULL, size = 300, prob = NULL, prop_sam = NULL, showids = TRUE, timeInterval = NULL )
samplingEx( population, response_column, degree_column, categorical_column, id_column, degree = NULL, net = NULL, seeds = 10, coupon = 2, waves = NULL, size = 300, prob = NULL, prop_sam = NULL, showids = TRUE, timeInterval = NULL )
population |
.csv file containing the population details. The first column should contain the individual's ID. |
response_column |
Column including responses. |
degree_column |
Column including degrees. |
categorical_column |
Column including categories. |
id_column |
Column including IDs. |
degree |
.csv file containing the network sizes. If it's not provided, the degree size would be the number of individuals in the networks provided. If both the networks and the degrees are not provided, then it will randomly produce degrees for each respondent. |
net |
.csv file contains the individual's network. It should contain the same format as the output of the 'net size table' given from the output of the inbuilt function 'population'. If not provided, would randomly assign individuals considering the degree. |
seeds |
Number of seeds. It should be a positive integer. |
coupon |
Number of coupons. It should be a positive integer. |
waves |
Number of waves. It should be a positive integer. |
size |
Sample size. It should be a positive integer. |
prob |
Whether the seeds are selected probabilistically. If you want to be selected randomly then leave it as 'NULL' If you want to relate it to a variable, add the variable name, and it should be related. For example, if the seeds selected should depend on the network size, then the 'network' should be used. If it is inversely proportional, then it should enter as '1/network'. |
prop_sam |
If the individuals from the network of an individual should be chosen randomly use "NULL". If the selection should be done probabilistically, state the variable and how it should be related. |
showids |
If TRUE, it displays the IDs being recruited. |
timeInterval |
Days between recruitment waves. Zeroth wave is assumed to recruit at the present day. |
RDS sample
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") # This should be replaced with a real world dataset. RDS.samplingEx <- samplingEx(population=RDS.population$frame, response_column="response", degree_column="network", categorical_column="character", id_column="s", seeds = 40, coupon = 2, waves =8) # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.samplingEx)
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") # This should be replaced with a real world dataset. RDS.samplingEx <- samplingEx(population=RDS.population$frame, response_column="response", degree_column="network", categorical_column="character", id_column="s", seeds = 40, coupon = 2, waves =8) # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.samplingEx)
Extract samples from the generated populations using the 'SimRDS::population' function. Can directly use this output to the RDS estimates in the RDS package.
samplingP( population, seeds, coupon, waves = NULL, size = NULL, prob = NULL, character_sam = NULL, showids = TRUE, timeInterval = NULL )
samplingP( population, seeds, coupon, waves = NULL, size = NULL, prob = NULL, character_sam = NULL, showids = TRUE, timeInterval = NULL )
population |
Population generated using the 'SimRDS::population' function. |
seeds |
Number of seeds. |
coupon |
Number of coupons. |
waves |
Number of waves. |
size |
Sample size. |
prob |
Whether the seeds are selected probabilistically. If you want to be selected randomly, leave it as 'NULL' If you want to relate it to a variable, add the variable name. For example, if the seeds selected depend on the network size, then 'network' should be used. It should enter as '1/network' if it is inversely proportional. |
character_sam |
If the individuals from the network of an individual should be chosen randomly use "NULL". If the selection should be done probabilistically, state the variable and how it should be related. Possible inputs are NULL, netSize, character. |
showids |
If TRUE, it displays the IDs being recruited. |
timeInterval |
Days between recruitment waves. Zeroth wave is assumed to recruit at the present day. |
RDS sample
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") RDS.samplingP <- samplingP(population=RDS.population, seeds = 40, coupon = 2, waves = 8, prob = '1/network', character_sam = 'netSize') # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.samplingP)
RDS.population <- population(N = 1000,p.ties = 0.6,minVal = 10,zeros = 0, pr = .5,pa = .1, atype_char = "net",atype_res = "net*char") RDS.samplingP <- samplingP(population=RDS.population, seeds = 40, coupon = 2, waves = 8, prob = '1/network', character_sam = 'netSize') # This function may take considerable time to produce output, depending on # the computer's performance. For a quick reference to the expected result, # a saved version of the output is available. data(RDS.samplingP)