Thunderx Thunderx - 3 months ago 25
HTTP Question

How to make a Mutlithreded idhttp calls to do work on a StringList

i am new to Threads, i have a List contains a strings. My goal is to make multiple threads do work to this List, this codes only for a single thread because i'm learning currently, however i get AV when i press start Button.

type
TDemoThread = class(TThread)
private
procedure Abort;
protected
procedure Execute; override;
public
List: TStringList;
end;

procedure TfrmMain.StartButton1Click(Sender: TObject);
var
i: integer;
List: Tstrings;
begin
for i := 0 to memo1.Lines.Count - 1 do
begin
List := TStringList.Create;
List.Add(memo1.Lines.Strings[i]);
end;

Thread := TDemoThread.Create(True);
Thread.FreeOnTerminate := True;
Thread.Start;
end;

procedure TDemoThread.Execute;
var
lHTTP: TIdHTTP;
i: integer;
X: Tstrings;
begin
inherited;
if Terminated then
Exit;

lHTTP := TIdHTTP.Create(nil);
X := TStringList.Create;
lHTTP.ReadTimeout := 30000;
lHTTP.HandleRedirects := True;

for i := 0 to List.Count - 1 do
try
X.Text := lHTTP.Get('https://instagram.com/' + List.Strings[i]);
S := ExtractDelimitedString(X.Text);
X.Clear;
TThread.Synchronize(nil,
procedure
begin
frmMain.Memo2.Lines.Add(List.Strings[i] + ' : ' + S);
end);
finally
end;
end;

Answer

Personally I'd avoid updating the form from the threads themselves. Threads are data generators here, not GUI managers. So let them separate their concerns.

I'd make all the threads accumulate the results into the same shared container and then make a GUI thread to poll that container instead. Human eyes are slow and Windows GUI is slow too, so you should not update your GUI more often than 2 or 3 times per second. It would only waste CPU load and blur the form into being unreadable.

Another thing would be to avoid using slow TStringList unless its extra functionality (which makes it slow) is required. The regular TList<string> is more than enough as a dumb container and is faster.

type 
  TDemoThread = class;

  TfrmMain = class(TForm)
  private
    Fetchers: TThreadList<TDemoThread>;
    Data:     TThreadList<string>;

    property inProcess: Boolean read ... write SetInProcess;
  public
    procedure AfterConstruction; override;
    procedure BeforeDestruction; override;
  ....
  end;

  // this demo makes each thread per each line - that is actually a bad design
  // one better use a thread pool working over the same queue and only have
  // 20-40 worker threads for all the URLs
  TDemoThread = class(TThread)
  private
    URL: string;  
    List: TThreadList<string>;
    Tracker: TThreadList<TDemoThread>;
  protected
    procedure Execute; override;
  end;

procedure TfrmMain.BeforeDestruction;
begin
  while TThreadList.Count > 0 do
    Sleep(100);

  FreeAndNil( Fetchers );
  Data.Free;

  inherited;
end;

procedure TfrmMain.AfterConstruction;
begin
  Fetchers := TThreadList<TDemoThread>.Create;
  Data :=     TThreadList<string>.Create; 
  inherited;
end;

procedure TfrmMain.StartButton1Click(Sender: TObject);
var
  i: integer;
  List: Tstrings;
  worker: TDemoThread;
  URL: string;
begin
  If inProcess then exit;

  for URL in memo1.Lines do begin
    worker := TDemoThread.Create(True);  
    worker.FreeOnTerminate := True;
    worker.URL := URL;
    worker.List := Data;
    worker.Tracker := Fetchers;
    Fetchers.Add( worker );
  end;

  InProcess := True;

  for worker in Fetchers do
    worker.Start;
end;

procedure TfrmMain.SetInProcess(const Value: Boolean);
begin
  if Value = InProcess then exit; // form already is in this mode

  FInProcess := Value;

  memo1.ReadOnly := Value;
  StartButton.Enabled := not Value;
  if Value then begin
     Memo2.Lines.Clear;
     Data.Clear;
  end;

  Timer1.Delay := 500; // twice per second
  Timer1.Enabled := Value;

  If not Value then  // for future optimisation - make immediate mode change 
     FlushData;      // when last worker thread quits, no waiting for timer event

  If not Value then
     ShowMessage('Work complete');
end;

procedure TfrmMain.Timer1Timer(const Sender: TObject);
begin
  FlushData;

  if Fetchers.Count <= 0 then
     InProcess := False;
end;

procedure TfrmMain.FlushData;
begin
  Data.LockList;  // next two operations should go as non-interruptible atom
  try
    Memo2.Lines.AddStrings( Data.ToArray() );
    Data.Clear;
  finally
    Data.UnLockList;
  end;
end;

procedure TDemoThread.Execute;
var
  lHTTP: TIdHTTP;
begin
  try 
    lHTTP := TIdHTTP.Create(nil);
    try
      lHTTP.ReadTimeout := 30000;
      lHTTP.HandleRedirects := True;

      S := ExtractDelimitedString( lHTTP.Get('https://instagram.com/' + URL) );

      List.Add( S );
    finally
      lHTTP.Destroy;
    end;
  finally
    Tracker.Remove( Self );
  end;
end;

PS. Personally, I'd also use OmniThreads Library, as it generally makes maintaining data-generating threads easier. For example just managing how many threads did you created becomes setting one property and determining when all threads complete their work is another oneliner. You really should not create a thousand of threads to fetch all the URLs, instead you should have 10-20 threads in a Thread Pool that would take the URLs from a Input Queue and fetch them one after another. I suggest you reading about OTL's Parallel For and Fork-Join patterns at http://otl.17slon.com/tutorials.htm - it would allow making such an application more concise and easier to write. Pipeline pattern would probably be even better match for this task - since you anyway prepare URLs list as a source collection. Half the scaffolding in StartButtonClick would be gone, and the whole TDemoThread class too.