基本創造因素

因子是在 R 中表示分類變數的一種方式。因子在內部儲存為整數向量。提供的字元向量的唯一元素稱為因子的級別。預設情況下,如果使用者未提供級別,則 R 將在向量中生成一組唯一值,按字母數字順序對這些值進行排序,並將它們用作級別。

 charvar <- rep(c("n", "c"), each = 3)
 f <- factor(charvar)
 f
 levels(f)

> f
[1] n n n c c c
Levels: c n
> levels(f)
[1] "c" "n"

如果要更改級別的順序,則可以選擇手動指定級別:

levels(factor(charvar, levels = c("n","c")))

> levels(factor(charvar, levels = c("n","c")))
[1] "n" "c"

因素有很多屬性。例如,級別可以給標籤:

> f <- factor(charvar, levels=c("n", "c"), labels=c("Newt", "Capybara"))
> f
[1] Newt     Newt     Newt     Capybara Capybara Capybara
Levels: Newt Capybara

可以分配的另一個屬性是該因子是否有序:

> Weekdays <- factor(c("Monday", "Wednesday", "Thursday", "Tuesday", "Friday", "Sunday", "Saturday"))
> Weekdays
[1] Monday    Wednesday Thursday  Tuesday   Friday    Sunday    Saturday 
Levels: Friday Monday Saturday Sunday Thursday Tuesday Wednesday
> Weekdays <- factor(Weekdays, levels=c("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"), ordered=TRUE)
> Weekdays
[1] Monday    Wednesday Thursday  Tuesday   Friday    Sunday    Saturday 
Levels: Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday

當不再使用因子級別時,可以使用 droplevels() 函式將其刪除:

> Weekend <- subset(Weekdays, Weekdays == "Saturday" |  Weekdays == "Sunday")
> Weekend
[1] Sunday   Saturday
Levels: Monday < Tuesday < Wednesday < Thursday < Friday < Saturday < Sunday
> Weekend <- droplevels(Weekend)
> Weekend
[1] Sunday   Saturday
Levels: Saturday < Sunday