Sanitizing Text Input

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#1
This is a hypothetical situation. Suppose you own a web forum, and want to sanitise the input for entry into a database, prepare it for HTML output, and remove any dangerous HTML tags.
//remove the user's HTML tags.
$post = htmlspecialchars($post);

//Prepare for database entry
$post = addslashes($post);

//Format newlines for HTML display
$string = chr(13);
$post = ereg_replace($string, "<BR>", $post);
The above code can be modified and improved in many ways; it is only an example.

Each part of the requirements has now been fulfilled to a certain extent. There is the question of whether binary data was submitted to the database, in which case the mysql_real_escape_string function would be needed. However, a simple conditional statement will suffice for identification of binary data.

I've seen many ways of sanitising text input from other users over at www.php.net.

Some are fairly imaginative and impressive coding. I've actually been learning a lot from surfing that site, not just the dictionary words but the syntax and flow of PHP code in general. It's a very laid back language with masses of potential for manipulation of data.
 

Brain Spout

Wizard No More
4,503
102
177
#2
Blaze Daily said:
This is a hypothetical situation...
no offense, but considering in your profile you have listed as your homepage another forum site and you also ask a lot of questions about how to do certain things related to creating a forum and what not, i don't really think that what you're asking is hypothetical. that being said i have no idea how to fix your problem/do whatever it is you're doing
 

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#3
That is a rather presumptious preposition. However inclined I am not apologising for my initial content. You ought to know better for a grown man, so high, really stowned. How about you, big daddy?
 

Brain Spout

Wizard No More
4,503
102
177
#4
Blaze Daily said:
That is a rather presumptious preposition. However inclined I am not apologising for my initial content. You ought to know better for a grown man, so high, really stowned. How about you, big daddy?
what are you talking about?
*presumptuous
preposition, wtf? i think you meant proposition which IMO is a rather awkward usage of the word, from there whatever you are saying confuses me more and more. i feel like i'm stoned reading it.


i won't siderail this discussion anymore than i already have
 

Jung

???
Premium
13,979
1,397
487
#5
No offense, man, but your code is horrible (all of it, in all your threads). Do yourself a favor and buy a PHP book.

Also, addslashes/stripslashes is not a safe way to sanitize input. The strings that you get as a result of addslashes are still unsafe for use in any SQL query because they haven't been escaped with the database-specific escape function, such as mysql_real_escape_string.


If you're using magic quotes and register_globals you should turn that shit off as well.
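
A rough sketch of why magic quotes plus manual escaping tends to cause trouble. It assumes magic quotes were switched on for the request data, which nothing in this thread confirms, and it uses addslashes()/stripslashes() as stand-ins for the automatic escaping and for the database's own unescaping, so it runs without a MySQL connection:

```php
<?php
// addslashes() stands in for both magic quotes and the database escape
// function; stripslashes() models the database parsing a string literal.
$typed = "It's here";

// Assumed step 1: magic quotes escape the request data automatically.
$fromRequest = addslashes($typed);

// Step 2: the script escapes the already-escaped value a second time.
$inQuery = addslashes($fromRequest);

// The database removes exactly one level of escaping when it reads the
// literal, so a backslash ends up stored with the text.
$storedTwice = stripslashes($inQuery);

// Escaping only once round-trips cleanly; nothing to strip on output.
$storedOnce = stripslashes(addslashes($typed));
```

With a single level of escaping, the stored text matches what the user typed, so the output side should not need stripslashes() at all.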
 

jamesp

In Memory...
1,714
1
0
#6
Seriously, every time you can't figure out how to do something on your website, Google it, don't just post multiple threads here about it. That's not how you learn to program, you've just got to figure it out.
 

BRiT

CRaZY
Founder
11,661
2,402
487
#7
Gawd all of that is ugly code! Maybe it's how scripting/hack languages are set up or maybe you're just a hack... I can't really tell.

You should isolate your database sanitizing inside the DAO/Accessor layer. Even when accessing the database at the lowest level possible, can you not use bind parameters to SQL so it will in no way evaluate your data?

PreparedStatement ps = dbConnection.prepareStatement("update T set C = ? where PK = ?");
ps.setString(1, columnData);
ps.setLong(2, pkId);
ps.execute();
Instead of:
String sqlString = "update TABLE set COLUMN = '" + columnData +"' where PK = '" + pkId + "'";
Statement s = dbConnection.createStatement();
s.execute(sqlString);
Also, is there nothing like a data-access library (equivalent of Hibernate/JDO) in these hack-languages?
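
For PHP, a rough equivalent of the bind-parameter pattern above might look like the following. This is a minimal sketch, not a drop-in fix: it assumes the PDO extension with the SQLite driver is available (an in-memory database stands in for MySQL), and the posts table, its columns, and the variable names are made up for illustration.

```php
<?php
// Sketch only: assumes PDO with the SQLite driver is installed.
// The "posts" table and its columns are hypothetical.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE posts (id INTEGER PRIMARY KEY, body TEXT)');
$db->exec("INSERT INTO posts (id, body) VALUES (1, 'placeholder')");

// Input with quotes, a backslash and HTML, stored exactly as typed.
$columnData = "It's a \"test\" \\ with <b>tags</b>";
$pkId = 1;

// The placeholders keep the data out of the SQL text entirely,
// so no addslashes()/stripslashes() round trip is involved.
$update = $db->prepare('UPDATE posts SET body = ? WHERE id = ?');
$update->execute([$columnData, $pkId]);

// Read the row back through another bound parameter.
$select = $db->prepare('SELECT body FROM posts WHERE id = ?');
$select->execute([$pkId]);
$stored = $select->fetchColumn();
```

The value read back should be identical to what was written, since the driver never evaluates the data as SQL.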

You should also keep the formatting isolated on the view side, that way you can have HTML/XHTML/XML presentations.
 

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#8
BRiT said:
Gawd all of that is ugly code! Maybe it's how scripting/hack languages are set up or maybe you're just a hack... I can't really tell.

You should isolate your database sanitizing inside the DAO/Accessor layer. Even when accessing the database at the lowest level possible, can you not use bind parameters to SQL so it will in no way evaluate your data?

Instead of:

Also, is there nothing like a data-access library (equivalent of Hibernate/JDO) in these hack-languages?

You should also keep the formatting isolated on the view side, that way you can have HTML/XHTML/XML presentations.
What language is this? I'm using PHP...lol...I dunno about the DAO....Are you sure that code snippet you provided would remove HTML tags, <BR /> newlines, and also prepare the data for MySQL entry...?

I hear mysql_real_escape_string is pretty good for mysql data prep...but it needs stripslashes to present the data effectively, no?
 

Jung

???
Premium
13,979
1,397
487
#9

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#10
junglizm said:
Ugh...

$post = stripslashes(mysql_real_escape_string(trim($post), $db));


Two database abstraction layers I know I've recommended before...

http://pear.php.net/package/DB

http://adodb.sourceforge.net
That's a really neat piece of code there, Junglizm. Dunno what trim does, or why it needs that $db ting. Gonna read up when I come down on php.net

What I don't understand is the use of mysql_real_escape_string in that snippet of code. Does calling the same command strip/add the escape characters?

For example, I converted the forum robbiedave.com to use mysql_real_escape_string, and I removed addslashes/stripslashes. But when the text was presented in HTML, it had many backslashes. Therefore, I had to include stripslashes before outputting the text to the browser. Now it works fine. Are you suggesting that, as well as the stripslashes command, I should also be using the mysql_real_escape_string command in conjunction with it?

That's fascinating if it's what you're saying...
 

jamesp

In Memory...
1,714
1
0
#11
Blaze Daily said:
What language is this? I'm using PHP...lol...I dunno about the DAO....Are you sure that code snippet you provided would remove HTML tags, <BR /> newlines, and also prepare the data for MySQL entry...?

I hear mysql_real_escape_string is pretty good for mysql data prep...but it needs stripslashes to present the data effectively, no?
I believe he is using ASP.NET.

But I disagree about modifying anything at the DAO layer. Just my preference to do it in the PHP.

Try something like this:

$post = htmlspecialchars( stripslashes($post));
 

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#12
jamesp said:
I believe he is using ASP.NET.

But I disagree about modifying anything at the DAO layer. Just my preference to do it in the PHP.

Try something like this:

$post = htmlspecialchars( stripslashes($post));
Jamesp - now there is a piece of code I understand ;) The whole embedded command syntax really is something I have yet to try out.

I think we are at loggerheads as to where to apply the htmlspecialchars command. I see most people are about preserving data, and your technique is to convert the HTML after entry into the database, rather than converting the string before inputting it into the db.


At the moment, I have my forum set to use htmlspecialchars() instead of strip_tags, because of data integrity.

Also, from Junglizm's advice, I switched over from addslashes to mysql_real_escape_string to escape any conflicting characters that would fuck up i/o.

However, after implementing mysql_real_escape_string() I found that stripslashes was still needed to remove backslashes mysql_r_e_s inputted into the text string. Strange, no? However, Junglizm confused me when he wrote using mres with stripslashes - as surely that's a coding oxymoron, innit? Does mysql_real_escape_string work both ways?
 

Brain Spout

Wizard No More
4,503
102
177
#13
take jung's advice and buy a good book on the subject. first a broad book, and then a book that focuses more on what you're doing. you aren't going to learn how to do any of it by reading forums, and if you do somehow learn it, it will be messy and other programmers will hate you
 

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#14
WizardlyFriend said:
take jung's advice and buy a good book on the subject. first a broad book, and then a book that focuses more on what you're doing. you aren't going to learn how to do any of it by reading forums, and if you do somehow learn it, it will be messy and other programmers will hate you
I bought a book on programming PHP and MySQL with Apache. This, and the Linux man pages (which is basically php.net), is how I learned enough to code robbiedave.com

I've been programming various flavours of basic for over a decade now, so I do not apologise if my coding practices seem rather unusual to you.

However, I enjoy discussion on the subject of programming, whatever the language. In fact, I'm really enjoying PHP graphics so much that I'm thinking of converting some of the routines to C, and making animations to help me understand the data sequences better.

I haven't programmed in C in years...lol...I wonder how it's changed?
 

Brain Spout

Wizard No More
4,503
102
177
#15
i don't know PHP at all, so i can't say whether or not your code is messy, i just know from my experience with other languages that when you look at another person's code and it is messy, you hate them for not taking the time to do something properly or at least neatly
 

Blaze Daily

<b>Banned - What an Asshat!</b>
146
0
0
#16
WizardlyFriend said:
i don't know PHP at all, so i can't say whether or not your code is messy, i just know from my experience with other languages that when you look at another person's code and it is messy, you hate them for not taking the time to do something properly or at least neatly
$post = htmlspecialchars($post);

//Prepare for database entry
$post = addslashes($post);

//Format newlines for HTML display
$string = chr(13);
$post = ereg_replace($string, "<BR>", $post);
This would be written:

$post = mysql_real_escape_string(htmlspecialchars($post));
$post = ereg_replace(chr(13), "<br />", $post);
The problem I find with embedded code is the execution sequence. I prefer to code the 'long way' just for clarity. I also am a stickler for defining every variable, even when some programmers believe certain data does not need a variable. I just think code is more legible with everything accounted for, and operational.

Personally, I don't have to worry about clock cycles when working on a personal website, and that's when programming really gets sticky!

Thanks for the input nevertheless, I still find programming a beautiful experience to behold every time I write a script - however patchy it is to begin. Coding for me is a refining process, from which blocks of code are developed over and over again. robbiedave.com is a growing site under development. I'm always tweaking routines, and changing the code until I feel satisfied. It's just fun for me, as I am not employed as a programmer (thx God) lol

Yeah, just a whole lot of entertainment learning new languages. PHP is definitely very powerful, and could be used for great things.
 

jamesp

In Memory...
1,714
1
0
#17
Blaze Daily said:
Jamesp - now there is a piece of code I understand ;) The whole embedded command syntax really is something I have yet to try out.

I think we are at loggerheads as to where to apply the htmlspecialchars command. I see most people are about preserving data, and your technique is to convert the HTML after entry into the database, rather than converting the string before inputting it into the db.


At the moment, I have my forum set to use htmlspecialchars() instead of strip_tags, because of data integrity.

Also, from Junglizm's advice, I switched over from addslashes to mysql_real_escape_string to escape any conflicting characters that would fuck up i/o.

However, after implementing mysql_real_escape_string() I found that stripslashes was still needed to remove backslashes mysql_r_e_s inputted into the text string. Strange, no? However, Junglizm confused me when he wrote using mres with stripslashes - as surely that's a coding oxymoron, innit? Does mysql_real_escape_string work both ways?
*Snaps Fingers!* That was it, I knew there was a better alternative to stripslashes, I just couldn't think of it.
 

BRiT

CRaZY
Founder
11,661
2,402
487
#18
jamesp said:
I believe he is using ASP.NET.

But I disagree about modifying anything at the DAO layer. Just my preference to do it in the PHP.

Try something like this:

$post = htmlspecialchars( stripslashes($post));
If anyone thought I was advocating for the DAO/Accessor to "change" the data, you misunderstood. I was showing an abstraction of how to access the database layer securely.

I'm a firm believer in proper design and separation of layers. You should have the text-input layer doing the sanitization. You should have the DAO/Accessor do the reading/writing/management of the data.

The database layer (DAO/Accessor) should do nothing but securely insert it into the database unchanged -- at least from the standpoint that: data == dbReadInfo(dbWriteInfo(data)); Specifically, the data stored and retrieved is identical to the original data.

Also, the data stored should be neutral enough so that the presentation layer can transform it to HTML/DHTML/XML/TEXT/JSON or what have you.
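
In PHP terms, keeping the stored data neutral and transforming it only at presentation time could look roughly like this. It is a sketch under stated assumptions: the function names renderPostAsHtml and renderPostAsJson are hypothetical, and $stored stands for whatever the database layer returned unchanged.

```php
<?php
// Hypothetical view helper: escapes for HTML only at output time.
function renderPostAsHtml(string $raw): string
{
    // Neutralise HTML metacharacters first, then mark up the newlines.
    $escaped = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
    return nl2br($escaped);
}

// Hypothetical second view over the very same stored data.
function renderPostAsJson(string $raw): string
{
    return json_encode(['body' => $raw]);
}

// Raw text exactly as the user typed it and as the database returned it.
$stored = "<script>alert('x')</script>\nSecond line";

$html = renderPostAsHtml($stored);
$json = renderPostAsJson($stored);
```

Because the HTML escaping happens in the view, the same stored row can feed the JSON output without first undoing any HTML entities.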

The abstraction layer shown was borrowed from the Java implementation in the java.sql package.