TensorFlow accuracy stuck at 25% for 4 labels, text classification
The accuracy starts off at around 40% and drops to 25% over the course of one epoch.
My model:
self._model = keras.Sequential()
self._model.add(keras.layers.Dense(12, activation=tf.nn.sigmoid)) # hidden layer
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax)) # output layer
optimizer = tf.train.AdamOptimizer(0.01)
self._model.compile(optimizer, loss=tf.losses.sparse_softmax_cross_entropy, metrics=["accuracy"])
I have 4 labels and 60k rows of data, split evenly across the labels (15k per label), plus 20k rows of data for evaluation.
My data example:
name label
abcTest label1
mete_Test label2
ROMOBO label3
test label4
Each character of the input is turned into an integer and then one-hot encoded; the output is just turned into integers in the range [0-3].
1 epoch evaluation (loss, acc):
[0.7436684370040894, 0.25]
UPDATE
More details about the data
The strings are up to 20 characters long.
I first convert each character to an integer based on an alphabet dictionary (a: 1, b: 2, c: 3, ...), and if a word is shorter than 20 characters I fill the rest with 0s. Those values are then one-hot encoded and reshaped, so (assuming a max of 5 characters):
1. ["abc","d"]
2. [[1,2,3,0,0],[4,0,0,0,0]]
3. [[[0,1,0,0,0],[0,0,1,0,0],[0,0,0,1,0],[1,0,0,0,0],[1,0,0,0,0]],[[0,0,0,0,1],[1,0,0,0,0],[1,0,0,0,0],[1,0,0,0,0],[1,0,0,0,0]]]
4. [[0,1,0,0,0,0,0,1,0,0,0,0,0,1,0,1,0,0,0,0,1,0,0,0,0],[0,0,0,0,1,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0,1,0,0,0,0]]
The labels describe the way a word is spelled, i.e. the naming convention: all lowercase - unicase, testBest - camelCase, TestTest - PascalCase, test_test - snake_case.
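For concreteness, a minimal sketch of the encoding described above (the alphabet dictionary here is a lowercase-only assumption, not the exact one used in my code):

import numpy as np

# hypothetical alphabet: lowercase letters only; index 0 is reserved for padding
alphabet = {c: i + 1 for i, c in enumerate("abcdefghijklmnopqrstuvwxyz")}
alphabet_len = len(alphabet) + 1      # +1 for the padding index
max_len = 20

def encode(word):
    # steps 1-2: integer-encode and zero-pad to max_len
    ids = [alphabet.get(c, 0) for c in word[:max_len]]
    ids += [0] * (max_len - len(ids))
    # step 3: one-hot encode each position
    one_hot = np.eye(alphabet_len)[ids]   # shape (max_len, alphabet_len)
    # step 4: flatten into a single feature vector
    return one_hot.reshape(-1)            # shape (max_len * alphabet_len,)

X = np.stack([encode(w) for w in ["abc", "d"]])   # shape (2, 20 * 27)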
With 2 extra layers added and the LR reduced to 0.001:
[Pic of training omitted]
Update 2
self._model = keras.Sequential()
self._model.add(
    keras.layers.Embedding(VCDNN.alphabetLen, 12, input_length=VCDNN.maxFeatureLen * VCDNN.alphabetLen))
self._model.add(keras.layers.LSTM(12))
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax))  # output layer
self._model.compile(tf.train.AdamOptimizer(self._LR), loss="sparse_categorical_crossentropy",
                    metrics=self._metrics)
It seems to start and then the process immediately dies with no error message (exit code -1073740791, i.e. 0xC0000409).
python tensorflow keras deep-learning
asked Nov 19 at 11:55 by Higeath; edited Nov 20 at 9:10 by functor
Are you sure that you are preserving the uppercase/lowercase state of the characters as you encode them as numbers? Your example shows your final vector is of length 25, which implies you don't preserve this information... Also, why don't you simplify your problem and map your characters only to uppercase characters (0s for example), lowercase characters (1s for example), and spaces and underscores (2s for example)?
– Tadej Magajna, Nov 19 at 13:25
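A rough sketch of the simplification this comment suggests (the class indices and the padding value are assumptions, not something from the original post):

def simplify(word, max_len=20):
    # map every character to one of three coarse classes:
    # 0 = uppercase, 1 = lowercase, 2 = underscore/space/other
    def char_class(c):
        if c.isupper():
            return 0
        if c.islower():
            return 1
        return 2
    ids = [char_class(c) for c in word[:max_len]]
    return ids + [2] * (max_len - len(ids))   # pad with the "other" class

# simplify("mete_Test") -> [1, 1, 1, 1, 2, 0, 1, 1, 1, 2, 2, ...]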
2 Answers
Accepted answer (functor, answered Nov 19 at 13:58):
A 0.25 accuracy means the model couldn't learn anything useful, since it is the same as random guessing. This suggests the network structure may not be a good fit for the problem.
Currently, recurrent neural networks such as the LSTM are more commonly used for sequence modeling. For instance:
model = Sequential()
model.add(Embedding(char_size, embedding_size))
model.add(LSTM(hidden_size))
model.add(Dense(len(VCDNN.conventions), activation='softmax'))
This should work better if the label depends on the character-sequence information in the input words.
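A minimal end-to-end sketch of this idea (the sizes are assumed values; note that an Embedding layer expects the padded integer sequences from step 2 of the question, not the flattened one-hot vectors):

import numpy as np
from tensorflow import keras

char_size, embedding_size, hidden_size = 27, 12, 12   # assumed sizes
max_len, num_classes = 20, 4

model = keras.Sequential()
model.add(keras.layers.Embedding(char_size, embedding_size, input_length=max_len))
model.add(keras.layers.LSTM(hidden_size))
model.add(keras.layers.Dense(num_classes, activation="softmax"))
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])

# X_int holds the padded integer sequences (step 2 in the question), y the labels 0-3
X_int = np.array([[1, 2, 3] + [0] * 17, [4] + [0] * 19])
y = np.array([0, 3])
model.fit(X_int, y, epochs=10, batch_size=32)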
Answer (Tadej Magajna, answered Nov 19 at 12:06):
This means your model isn't really learning anything useful. It might be stuck in a local minimum. This could be due to the following reasons:
- a) You don't have enough training data to train a neural network. NNs usually require fairly large datasets to converge. Try using a RandomForest classifier first to see what results you can get there (a quick baseline sketch follows this list).
- b) It's possible your target data might not have anything to do with your training data, so it's impossible to train a model that maps between them efficiently without overfitting.
- c) Your model could do with some improvements.
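A possible quick-baseline sketch for a) (X_train, y_train, X_eval, y_eval are assumed to be the same arrays you feed to Keras):

from sklearn.ensemble import RandomForestClassifier

# fit a random forest on the flattened one-hot features as a sanity check
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)
print("random forest accuracy:", rf.score(X_eval, y_eval))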
If you want to give improving your model a go, I would add a few extra dense layers with a few more units. So after line 2 of your model I'd add:
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))
Another thing you can try is a different learning rate. I'd go with the default for AdamOptimizer, which is 0.001, so just change 0.01 to 0.001 in the AdamOptimizer() call.
You may also want to train for more than just one epoch.
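Putting those suggestions together, a hedged sketch of the modified model (it reuses the names from your snippet, and the layer sizes and epoch count are just examples):

self._model = keras.Sequential()
self._model.add(keras.layers.Dense(12, activation=tf.nn.sigmoid))   # hidden layer
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))   # extra hidden layer
self._model.add(keras.layers.Dense(36, activation=tf.nn.sigmoid))   # extra hidden layer
self._model.add(keras.layers.Dense(len(VCDNN.conventions), activation=tf.nn.softmax))  # output layer
optimizer = tf.train.AdamOptimizer(0.001)   # default Adam learning rate
self._model.compile(optimizer, loss=tf.losses.sparse_softmax_cross_entropy, metrics=["accuracy"])
# train for more than one epoch, e.g.:
# self._model.fit(x_train, y_train, epochs=20, batch_size=32)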
Thanks for the response, I've added the layers and am retraining it right now (though it seems to be going down to 25% again). I've updated the post with more information about the data; does that clarify what I am doing wrong? Also, the data is sorted so it goes label by label; could that have an impact? I believe fit shuffles the data by default though.
– Higeath, Nov 19 at 12:32