Wednesday, March 21, 2012

Distinct & eliminate duplicate data

Hi there.. Although i know my questions will sound quite stupid, but i've tried n tried n i can't get the desired output.. so here goes...

I'm using SQL Server 2000. I have one table call SMSReceived. in that table, there's a lot of columns. But i need to display only 2 column which is column Sender and column Message. Since this is our project, we had a lot of data with the same number in the column Sender. I want to display the data but I only want the Sender number only once. (no redundant data) So i try my query like this :

select distinct Sender from SmsReceived <-- this works since they are no redundant data

then i tried 2 columns which is :

select distinct Sender, Message from SmsReceived

then, but the Sender number still redundant.. It's like distinct is not functioning. I think it is because of the different message.

So, can anyone help me with this?.. I need to display 2 columns but only one Sender number. you can eliminate either one data as long as the Sender number didn't appear twice.

Erm... i tried group by, but an error occured like this..

Column 'Message' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

Sender, Message
1234, hi how are you?
1234, r u ok?
7777, my name is
7777, hello

yeah, it is something like this.. But i need to eliminate either one...

So the result will become like this..

1234, hi, how are you?
7777,hello

Is it possible?.. Please help me... :( :(select Sender, min(Message) as lowest_message
from SmsReceived
group by Sender:)|||thanks..it already works...

select dnyFrom as 'SMS Sender', min(dnyReason) as 'Reason', min(dnySMSText) as 'SMS Text', min(dnycustomerid) as 'Customer Id'
from shldnysmsreceived
where dnycustomerid='PRM' and
dnycustomerid is not null and
dnyDateReceived between '2006-12-01 00:00:00.000' and '2007-07-01 00:00:00.000'
group by all dnyFrom
order by min(dnySMSText) asc

and one more thing... for example if i had data that looks like this

1234, successful
1234, invalid message
4444, winner of prize
4444, successful
4444, invalid format
6666, hello
6666, bulk sms

i need to display only the data with 'successful'. so the output will look like this

1234, successful
4444, successful
6666, hello <-- can choose either one (hello or bulk sms)|||the solution i gave you was for two columns, one for the GROUP BY and the other for the MIN() function

you do realize that by having multiple MIN() expressions, the resulting values are not necessarily from the same row, don't you?|||No, i don't know that... why we can't have multiple expression? Can u explain by giving examples? .. I really dont understand :(|||oh, you can have multiple MIN() expressions, that's not the problem

example:

tom 12 ggg 105
tom 11 hhh 108
tom 15 xxx 104
bob 16 bbb 112
bob 13 ccc 109
bob 17 aaa 137

assume GROUP BY on the first column, with MIN on the others, the resuts are:

tom 11 ggg 104
bob 13 aaa 109

see? good results from the query, but these do not represent actual rows from the table|||owh.. so how am i going to choose 'successful' only from other data? i don't know how to use if statement..|||you would probably use a CASE expression

but i really do not understand what you are trying to do

your example data uses two columns, but your query has four|||owh.. i was just trying to make it easy for the readers to understand.. the actual data have 4 or more columns that needed to be displayed. but for the moment, i only need to know how to display 2 columns with conditions that already stated above... and i don't know how to use CASE.. :(|||if i give you the solution for only two columns, it will be of very little use to you...

... so i won't, sorry

for an example of the CASE expression, see http://www.sqlzoo.net/fun_case|||Thanks everyone..I already solve it..using case...|||hi.. i have some query that i want to solve. i had a data that looks like this. (it's actually a text messages where people sms in to win prizes) and it have to start with 'PRM'

SMSReceived <-- column name
Prm 65719471
Prm 68516237
PRM 72847410
Prm 75031193
PRM (SPACE) PINSEND TO32080
prm (space)36398226
PRM < > 22733564
Prm < 82848916 > pin
PRM <36943554>

how can i choose only the number? so the desired output will look like this..
65719471
68516237
72847410
75031193
32080
36398226
22733564
82848916
36943554

please help..|||While most SQL engines can do this kind of text manipulation, the SQL standard doesn't provide any standard way that I know to acomplish this. In other words, you can use the vendor specific extensions to get this done, but I don't know of anything in the SQL language definition that will give you a generic solution that will work on any SQL implementation.

I'd post this question in the appropriate vendor/engine specific forum to get an answer (if I didn't already know how to solve it using the engine of choice).

-PatP|||I would recommend REGEXP in MySQL but i know you're not using MySQL. I'm sure there is a MSSQL equivalent though. More importantly I think you should be stripping this information down before it goes into the database...

ever thought that someone can SQL inject from a text message?|||Isn't this exactly why your answer should be classed as incorrect if you don't follow a set pattern.
E.g. "Text PRM then a space then your answer to 55555"
If your results were always consistent it'd be easy as pie to extract just the number partsql

No comments:

Post a Comment