Title: | Techniques for Automated Classifiers |
---|---|
Description: | A set of techniques that can be used to develop, validate, and implement automated classifiers. A powerful tool for transforming raw data into meaningful information, 'ncodeR' (Shaffer, D. W. (2017) Quantitative Ethnography. ISBN: 0578191687) is designed specifically for working with big data: large document collections, logfiles, and other text data. |
Authors: | Cody L Marquart [aut, cre] |
Maintainer: | Cody L Marquart <[email protected]> |
License: | GPL-3 | file LICENSE |
Version: | 0.2.0.1 |
Built: | 2025-02-22 03:24:01 UTC |
Source: | https://github.com/cran/ncodeR |
Convert a Code object to a data.frame
## S3 method for class 'Code'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
x | Code object to convert |
row.names | NULL or a character vector giving the row names for the data frame. Missing values are not allowed. |
optional | logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. |
... | additional arguments to be passed to or from methods |
data.frame
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
as.data.frame(newcode)
Convert a CodeSet object to a data.frame
## S3 method for class 'CodeSet'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
x | CodeSet to convert |
row.names | NULL or a character vector giving the row names for the data frame. Missing values are not allowed. |
optional | logical. If TRUE, setting row names and converting column names (to syntactic names: see make.names) is optional. |
... | additional arguments to be passed to or from methods |
data.frame
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
code.set = code.set("Demo RS CodeSet", "CodeSet made for the demo", excerpts = rs$text, codes = c(newcode))
as.data.frame(code.set)
Autocode all codes provided, either directly as a Code or as part of a provided CodeSet
autocode(x = NULL, expressions = NULL, excerpts = NULL, simplify = T, mode = "all")
x | Object to autocode. Either a Code or CodeSet |
expressions | Expressions to use for coding (optional) |
excerpts | Excerpts to code |
simplify | If TRUE, returns a data.frame, else returns a Code or CodeSet object |
mode | Either all, training, or test representing the set of excerpts that should be recoded in the computerSet |
A data.frame if simplify = T (default); otherwise the Code or CodeSet object with updated computerSets
Create a new CodeSet object
code.set(title = "", description = "", excerpts = c(), codes = c())
title | Title for the CodeSet |
description | Description of the CodeSet |
excerpts | Set of excerpts to use with the CodeSet |
codes | Set of codes to attach to the CodeSet |
CodeSet object
data(RS.data)
rs = RS.data
code.set = code.set("Demo RS CodeSet", "CodeSet made for the demo", excerpts = rs$text, codes = c())
Object representing a set of codes
CodeSet
An object of class R6ClassGenerator of length 24.
CodeSet object
CodeSet
title
Title of the CodeSet
description
String description of the set of codes to be included
excerpts
Character vector of text excerpts to code (optional)
expressions
Codes to include in the CodeSet (optional)
data(RS.data)
rs = RS.data
code.set = code.set("Demo RS CodeSet", "CodeSet made for the demo", excerpts = rs$text, codes = c())
Create a code
create.code(name = "NewCode", definition = NULL, excerpts = NULL, type = "Regex", ...)
name | Name of the code |
definition | Definition of the Code |
excerpts | Character vector of excerpts to use for coding |
type | Character string representing the type of code (Default: "Regex") |
... | Additional parameters |
Code object
data(RS.data)
rs = RS.data
# Generate a Code
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
Find rows that differ within a data.frame or two vectors
differences(code = NULL, wh = "trainingSet", to = "computerSet")
code | Code object to search for differences |
wh | Set to use as the base comparison |
to | Set to compare wh to |
logical vector representing indices that are coded differently
vector of indices representing differences
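The comparison described above can be illustrated with a minimal base-R sketch. This is not the package's implementation: it assumes the two sets are plain 0/1 vectors of equal length with NA marking excerpts that have not been coded, and the names hand_codes, computer_codes, and coded_differently are hypothetical.

```r
# Hypothetical hand (training) codes and computer codes for six excerpts.
# NA marks an excerpt that has not been hand coded yet.
hand_codes     <- c(1, 0, NA, 1, 0, NA)
computer_codes <- c(1, 1, 0, 0, 0, 1)

# Flag positions where both sets hold a code and the codes disagree.
coded_differently <- function(base, other) {
  stopifnot(length(base) == length(other))
  !is.na(base) & !is.na(other) & base != other
}

is_different <- coded_differently(hand_codes, computer_codes)  # logical vector
which(is_different)                                            # indices that differ
```

The logical vector and the index vector carry the same information; the index form is convenient for pulling the disagreeing excerpts out for review.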
Match a set of text excerpts against a set of regular expressions
expression.match(excerpts, expressions, names = list(NULL, "V1"))
excerpts | Character vector to match against |
expressions | Character vector of expressions |
names | Character vector to use for dimension names |
Matrix representing matched expressions
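As an illustration of matching excerpts against regular expressions, the following base-R sketch builds a logical matrix with one row per excerpt and one column per expression. It is a sketch under stated assumptions, not the package's code: match_expressions is a hypothetical helper, matching is assumed to be case-insensitive, and treating an excerpt as positive when any expression matches is likewise an assumption.

```r
# Hypothetical helper: one row per excerpt, one column per expression.
match_expressions <- function(excerpts, expressions) {
  hits <- vapply(
    expressions,
    function(pattern) grepl(pattern, excerpts, ignore.case = TRUE, perl = TRUE),
    logical(length(excerpts))
  )
  # vapply returns a plain vector for a single excerpt; restore the matrix shape.
  matrix(hits, nrow = length(excerpts), ncol = length(expressions),
         dimnames = list(NULL, expressions))
}

excerpts    <- c("We collected data today", "No numbers here", "Hello")
expressions <- c("number", "data")

m <- match_expressions(excerpts, expressions)
# Assumption: an excerpt counts as positive if any expression matches.
any_match <- rowSums(m) > 0
```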
Handset indices
getHandSetIndices(codeToUse, handSetLength = 20, handSetBaserate = 0.2, unseen = F)
codeToUse | [TBD] |
handSetLength | [TBD] |
handSetBaserate | [TBD] |
unseen | [TBD] |
Get indices to code
getHandSetIndices2(code, handSetLength = 20, handSetBaserate = 0.2, unseen = F, this.set = NULL)
code | Code object |
handSetLength | Number of excerpts to put into the test set |
handSetBaserate | Minimum number of positives that should be in the test set |
unseen | [TBD] |
this.set | [TBD] |
Code object with an updated test set and computer set
Handcode a set of excerpts using a vector of expressions
handcode(code = NULL, excerpts = NULL, expressions = NULL,
  n = ifelse(is.null(this.set), 10, length(this.set)), baserate = 0.2,
  unseen = F, this.set = NULL, results = NULL)
code | Code object to handcode |
excerpts | Excerpts to code (optional) |
expressions | Expressions to code with (optional) |
n | Number of excerpts to handcode |
baserate | Value between 0 and 1; inflates the base rate of the excerpts chosen for coding, ensuring that the number of positives is at least n * baserate |
unseen | Logical or number indicating whether additional excerpts with unseen words should be added. If TRUE, two words are added; if a number, that many |
this.set | [TBD] |
results | [TBD] |
Code
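The baserate argument guarantees a minimum number of positives among the excerpts chosen for hand coding. One way to picture this is the base-R sketch below. It is only an assumption about how such sampling could work, not the package's algorithm: sample_handset is a hypothetical function, and "positive" is assumed to mean that the computer coded the excerpt as 1.

```r
# Hypothetical sampler: draw n excerpt indices so that at least
# ceiling(n * baserate) of them are ones the computer coded as positive.
sample_handset <- function(computer_codes, n = 10, baserate = 0.2) {
  positives <- which(computer_codes == 1)
  min_positives <- min(ceiling(n * baserate), length(positives))
  # Index via sample.int() because sample(x, k) treats a length-one x as 1:x.
  chosen_pos <- positives[sample.int(length(positives), min_positives)]
  remaining <- setdiff(seq_along(computer_codes), chosen_pos)
  n_rest <- min(n - min_positives, length(remaining))
  chosen_rest <- remaining[sample.int(length(remaining), n_rest)]
  out <- c(chosen_pos, chosen_rest)
  out[sample.int(length(out))]  # shuffle so positives are not grouped first
}

set.seed(42)
computer_codes <- rep(c(1, 0), times = c(5, 45))  # 10% positives overall
handset <- sample_handset(computer_codes, n = 10, baserate = 0.2)
sum(computer_codes[handset])  # at least 2 positives by construction
```

Inflating the base rate this way keeps rare codes from producing hand-coded sets with no positive examples at all.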
Wrapper for the entire coding process
ncode()
Run tests (kappa, rho) on the given Code
old_test(code, kappaThreshold = 0.65, baserateInflation = 0.2, type = c("training", "test"))
code | Code object to test |
kappaThreshold | Threshold used for calculating rhoR::rho |
baserateInflation | inflation rate to use when sampling handsets |
type | vector indicating which stats should be calculated |
Code object with updated statistics property
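For readers unfamiliar with the kappa statistic, the base-R sketch below computes Cohen's kappa for two binary codings and compares it with the 0.65 threshold. It uses the standard formula, kappa = (observed - expected) / (1 - expected), and is not taken from the package; cohens_kappa and the sample vectors are hypothetical. The rho statistic comes from rhoR::rho and is not reproduced here.

```r
# Cohen's kappa for two binary (0/1) codings of the same excerpts.
cohens_kappa <- function(rater1, rater2) {
  stopifnot(length(rater1) == length(rater2))
  observed <- mean(rater1 == rater2)          # proportion of agreement
  p1 <- mean(rater1 == 1)                     # positive rate of rater 1
  p2 <- mean(rater2 == 1)                     # positive rate of rater 2
  expected <- p1 * p2 + (1 - p1) * (1 - p2)   # agreement expected by chance
  (observed - expected) / (1 - expected)
}

hand     <- c(1, 1, 0, 0, 1, 0, 0, 0, 1, 0)
computer <- c(1, 0, 0, 0, 1, 0, 0, 1, 1, 0)

kappa <- cohens_kappa(hand, computer)
meets_threshold <- kappa >= 0.65
```

Here the raters agree on 8 of 10 excerpts, chance agreement is 0.52, and kappa is 0.28 / 0.48, which falls below the 0.65 threshold.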
Print a Code summary
## S3 method for class 'summary.Code'
print(x, ...)
x | list from summary() |
... | Additional parameters |
Prints code summary
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
summary(newcode)
Print the summary of a CodeSet
## S3 method for class 'summary.CodeSet'
print(x, ...)
x | Summary of a CodeSet |
... | Additional parameters |
prints summary
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
code.set = code.set("Demo RS CodeSet", "CodeSet made for the demo", excerpts = rs$text, codes = c(newcode))
summary(code.set)
Print a TestList summary
## S3 method for class 'summary.TestList'
print(x, ...)
x | list from summary() |
... | Additional parameters |
prints summary
data(RS.data)
rs = RS.data
newcode <- create.code("Data", expressions = c("number","data"), excerpts = rs$text)
newcode <- handcode(newcode, this.set = 10:15, results = 0)
newcode = test(code = newcode, kappa_threshold = 0.65)
summary(newcode$statistics)
Creates an object for regular-expression coding. There is no need to call this directly; create.code is a wrapper around this and any other types of Codes
RegexCode
An object of class R6ClassGenerator of length 24.
RegexCode object
name
Name of the Code
definition
Definition of the Code
excerpts
Character vector of text excerpts to code
...
Additional parameters not specific to a RegexCode
expressions
Character vector of regular expressions
data(RS.data)
rs = RS.data
# Generate a Code
newcode = RegexCode$new(name = "New Code", definition = "Some definition",
  excerpts = rs$text, expressions = c("number","data"))
Resolve differing results
resolve(code = NULL, trainingSet = NULL, computerSet = NULL, expressions = NULL, excerpts = NULL, ignored = NULL)
code | Code to resolve coding differences |
trainingSet | Optionally provide a trainingSet, default: code$trainingSet |
computerSet | Optionally provide a computerSet, default: code$computerSet |
expressions | Optionally provide a set of expressions, default: code$expressions |
excerpts | Optionally provide a set of excerpts, default: code$excerpts |
ignored | Optionally provide a set of excerpts to ignore during the resolve cycle loop |
A dataset containing sample chat data from the RescuShell Virtual Internship
RS.data
An object of class data.frame with 3824 rows and 20 columns.
Obtain summary of a Code object
## S3 method for class 'Code'
summary(object, ...)
object | Code to summarize |
... | Additional parameters |
List of Code summary
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
summary(newcode)
Obtain a summary of the CodeSet
## S3 method for class 'CodeSet'
summary(object, ...)
object | CodeSet object |
... | Additional parameters |
list containing description and Code summaries
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
code.set = code.set("Demo RS CodeSet", "CodeSet made for the demo", excerpts = rs$text, codes = c(newcode))
summary(code.set)
Obtain a summary of a Code's test results
## S3 method for class 'TestList'
summary(object, ...)
object | TestList object of Code |
... | Additional parameters |
list of Test summary
data(RS.data)
rs = RS.data
newcode = create.code(name = "Data", expressions = c("number","data"), excerpts = rs$text)
newcode <- handcode(newcode, this.set = 10:15, results = 0)
newcode = test(code = newcode, kappa_threshold = 0.65)
summary(newcode$statistics)
Run tests on the given Code
test(code, kappa_threshold = 0.65, baserate_inflation = 0.2, ...)
code | [TBD] |
kappa_threshold | [TBD] |
baserate_inflation | [TBD] |
... | [TBD] |
code object