I'm looking for recommendations on the best way forward for my current machine learning problem.
The outline of the problem and what I've done is as follows:
I would first try a grid search over the parameter space, using k-fold cross-validation on the training set (and keeping the test set to the side, of course). Then pick the set of parameters that generalizes best according to the k-fold cross-validation. I suggest using GridSearchCV with StratifiedKFold (it's already the default strategy for GridSearchCV when you pass a classifier as the estimator).
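As a rough sketch of that workflow: the synthetic data from make_classification, the grid values for C and gamma, and the 5-fold split are all placeholder assumptions, not values tuned for your problem.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.svm import SVC

# Placeholder data; substitute your own feature matrix X and labels y.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hold out a test set that the grid search never sees.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative grid only; real ranges depend on the data.
param_grid = {"C": [0.1, 1, 10], "gamma": [0.01, 0.1, 1]}

# Stratified folds keep the class proportions similar in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=cv)
search.fit(X_train, y_train)

best_params = search.best_params_          # parameters with the best mean CV score
cv_score = search.best_score_              # mean cross-validated accuracy
test_score = search.score(X_test, y_test)  # accuracy on the held-out test set
print(best_params, cv_score, test_score)
```

A large gap between cv_score and test_score would suggest the search itself has overfit the training folds.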
In principle, an SVM with an RBF kernel can fit any training set of distinct points perfectly, since its VC dimension is infinite. So if tuning the parameters doesn't help reduce overfitting, you may want to try a similar parameter tuning strategy with a simpler hypothesis class, such as a linear SVM or another classifier you think may be appropriate for your domain.
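One way to check this, sketched on synthetic data with deliberately extreme RBF settings (both are assumptions for illustration), is to compare training accuracy against cross-validated accuracy for each model:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Placeholder data; substitute your own training set.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# A flexible RBF model (large C and gamma chosen to encourage overfitting)
# next to a simpler linear model.
models = {
    "rbf": SVC(kernel="rbf", C=1000, gamma=10),
    "linear": SVC(kernel="linear", C=1),
}

results = {}
for name, model in models.items():
    train_acc = model.fit(X, y).score(X, y)              # accuracy on the data it was fit on
    cv_acc = cross_val_score(model, X, y, cv=5).mean()   # accuracy on held-out folds
    results[name] = (train_acc, cv_acc)
    print(f"{name}: train={train_acc:.3f} cv={cv_acc:.3f}")
```

A training accuracy far above the cross-validated accuracy is the usual sign that the model is too flexible for the amount of data.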
Regularization, as you mentioned, is definitely a good idea if it's available.
The fact that the same label is always predicted makes me think that label imbalance may be an issue, and in that case you could use different class weights. For an SVM, this means each class gets its own C penalty weight. Some estimators in sklearn also accept fit parameters that let you pass sample weights, which set the amount of penalty for individual training samples.
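A minimal sketch of both options, assuming a synthetic 90/10 imbalanced dataset and the "balanced" weighting heuristic (neither comes from your setup):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.utils.class_weight import compute_sample_weight

# Placeholder imbalanced data: roughly 90% class 0, 10% class 1.
X, y = make_classification(
    n_samples=300, n_features=10, weights=[0.9, 0.1], random_state=0
)

# Option 1: per-class weights. "balanced" scales C for each class
# inversely to that class's frequency in y.
clf_class = SVC(kernel="rbf", class_weight="balanced")
clf_class.fit(X, y)

# Option 2: per-sample weights passed to fit(). Here they are derived
# from the class frequencies, but any non-negative weights would work.
sample_weight = compute_sample_weight("balanced", y)
clf_sample = SVC(kernel="rbf")
clf_sample.fit(X, y, sample_weight=sample_weight)

# Check that the classifier no longer predicts a single label everywhere.
predicted_labels = np.unique(clf_class.predict(X))
print(predicted_labels)
```

An explicit dictionary such as class_weight={0: 1, 1: 10} can be used instead of "balanced" if you want to control the ratio yourself.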
Now, if you think the features may be an issue, I would try feature selection based on the F-values provided by f_classif, which can be used with something like SelectKBest. Another option would be recursive feature elimination with cross-validation. Feature selection can be wrapped into a grid search as well if you use sklearn's Pipeline API.
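Both ideas can be sketched as follows. The step names "select" and "svc", the candidate values for k and C, and the synthetic data are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV, SelectKBest, f_classif
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Placeholder data with a few informative features among noise.
X, y = make_classification(
    n_samples=200, n_features=10, n_informative=3, random_state=0
)

# Feature selection inside a Pipeline, so it is refit on each training fold
# and the held-out fold never influences which features are kept.
pipe = Pipeline([
    ("select", SelectKBest(score_func=f_classif)),
    ("svc", SVC(kernel="rbf")),
])

# Step parameters are addressed as <step name>__<parameter>.
param_grid = {"select__k": [2, 5, 10], "svc__C": [0.1, 1, 10]}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X, y)
best_params = search.best_params_
print(best_params)

# Alternative: recursive feature elimination with cross-validation.
# RFECV needs an estimator that exposes coef_ or feature_importances_,
# hence the linear kernel here.
rfecv = RFECV(estimator=SVC(kernel="linear"), step=1, cv=5)
rfecv.fit(X, y)
n_selected = rfecv.n_features_  # number of features RFECV kept
print(n_selected)
```

Putting the selector in the pipeline matters: selecting features on the full training set before cross-validating would leak information into the validation folds.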